Concordance between Sources of Morbidity Reports: Self-Reports and Medical Records

As part of a 10-year follow-up study of morbidity following spouse bereavement, concordance between subject reports of their illness experience and that given by their doctors’ and other medical records has been assessed. Enumeration from medical records involved extensive and careful perusal of general practitioner, specialist, and hospital records while subject reports were aided by a structured questionnaire which helped to prompt subjects’ memories. The findings showed generally poor concordance between these two sources of morbidity data. Overall only 22% of disease events were found in both sources: of the diseases that did not match 65% were from the record source and 35% were from the self-report source. Despite finding that concordance rates varied with some subject and disease factors, concordance was always less than might be expected to occur by random chance (the throw of a coin). These findings have serious implications for epidemiological and pharmacoeconomic research involving morbidity history as they suggest that neither the subject nor their medical record can generally be assumed to provide a complete enumeration of morbidity burden. Indeed, irrespective of the significant factors under consideration, the maximum concordance reached in this study was 45.7%.

Introduction
In a 10-year follow-up study on the morbidity of bereaved and non-bereaved subjects, morbidity data were obtained from two original and independent sources (Jones et al., 2010). Morbidity information was collected from medical records (general practitioners, specialists, and hospital records); this was termed the record morbidity source. A systematic history of the disease episodes over the follow-up period was also obtained from subjects in an interview; this comprised the self-report morbidity source. Diseases found in these two sources were then matched on both the disease description and the year of occurrence. The findings of this study that relate to morbidity sequelae of spouse bereavement have been published elsewhere (Jones et al., 2010). Jones et al. (2010) also reported that both bereaved and non-bereaved subjects had the same rate of disease matching: only 22% of the diseases collected were found in both sources of data.
This result led us to investigate further the concordance between subjects and medical records and sources of variability in that concordance. Selecting the data source for a study is one of the critical steps in the design of a project. The choice should be partly motivated by the reliability of the source as well as the source's potential impact on the validity of the study. Matching between various sources of health-related data has been reviewed (Harlow and Linet, 1989). In this review there were seven studies that were comparable with the current work, although some of the self-report and medical record sources are not identical to ours. However, all these studies had a relatively short follow-up time and most followed only certain illnesses. The current paper was motivated to provide a better understanding, for future reference, of the concordance of morbidity sources when all diseases are followed up over a long period.
In this paper we have two basic aims. The first is to document the rate of concordant reports of disease events between patients and their medical records, since it has a substantial impact on the design of many epidemiological studies. The second is to examine correlates of this concordance in an attempt to understand the factors involved. Finally, in the discussion we relate our work to the results of others.
The problem addressed in this paper is not completely new, since many epidemiological and other forms of health survey involve individuals' recall of their medical history. Indeed there are reports dating back as far as the mid-1950s on this topic (Kreuger, 1957). Kreuger (1957) reported poor concordance between household interview and physician records of chronic health conditions, with most concordance rates below 50%. Gerbert et al. (1988) also found poor concordance between physician and patient recollection of medication regimen. Nor have more recent reports suggested substantially different findings, with Barr et al. (2009) reporting cardiovascular disease event concordance of approximately 68%. A difference between the current and past work in this area, however, is the inclusion of temporal matching, such that this paper reports on patient versus medical record recall of disease events.

Collection of data sources
Of 176 subjects who participated in an Australian study of the effect of bereavement on subsequent morbidity (Jones et al., 2010), 11 died during the follow-up period, 2 were lost, and 11 refused to participate, leaving 152 subjects. Of these, a further four subjects were excluded either because they experienced no illnesses during the follow-up period (three) or because they could not put a time to the illness, so that confirmation could not be achieved (one).
Sociodemographic and morbidity data were collected in a systematic interview with subjects; each disease that subjects had suffered over the last 11 years was recorded in addition to the year of its occurrence; this source of data was termed the "self-report" source. Note therefore that all references to sociodemographic measures pertain only to study subjects, not their doctors. The year of disease occurrence was used since it was felt to be the smallest unit of time in which subjects could reliably recall diseases which might have occurred up to 11 years previously. The data collection forms were set out such that the subject was prompted to recall illnesses within each of the 17 major categories of the ICD-9 system. Under each category, the most common diseases had been listed on the data collection forms so that the interviewer could ask about these and in doing so aid subjects' recall. The interviews were conducted independently, largely in person although in some cases by telephone, by two medical practitioners. The average follow-up time for self-reports was 10 years.
Upon enrollment in the above-mentioned study, permission had been sought from subjects to approach their doctors and review their records with general practitioners and specialists as well as their hospital records; this source of data was termed the "medical record" source. The subject was asked to name all physicians they had seen and hospitals they had visited during the follow-up period and for permission to examine records held by these health services. In a few cases the general practitioner refused access to records, but a complete profile was thought to be available from other physicians' and/or hospital records. Using these records, the medical practitioners working on this study then collected diseases found for the full follow-up period (or as much as was available), on a yearly basis for each subject. The record source had an average follow-up period of 8.4 years.
Both record and self-report sources were later coded according to the ninth revision of the ICD.

Disease matching between data sources
The matching process was performed manually by two medical practitioners, since it would have been difficult for computerized matching to account for all possible lay terminology for diseases. Concordance between the two data sources was only considered to have occurred if both sources reported the same disease and the same year of occurrence for that disease. The possibility of allowing some further "latitude" in temporal coincidence was considered but rejected. As it is, concordance only requires both sources to report occurrence within a 12-month period; further latitude would arguably cease to be matching. More importantly, since many diseases may recur, perhaps annually, reports of the same disease in different years may really be different disease incidents. The rate of concordance between sources used in this article is the number of concordant disease reports, as defined above, divided by the total number of disease reports from either source.
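The matching rule and rate defined above can be sketched in a few lines of Python. This is only an illustration and not the procedure used in the study, where matching was performed manually by two medical practitioners. The function name `concordance_rate` and the toy (ICD-9 code, year) pairs are hypothetical, and the sketch assumes that both sources have already been coded to ICD-9.

```python
def concordance_rate(self_reports, record_reports):
    """Proportion of distinct disease events reported by both sources.

    Each report is a (disease_code, year) pair. A match requires the
    same disease code and the same year of occurrence in both sources.
    """
    self_set = set(self_reports)
    record_set = set(record_reports)
    matched = self_set & record_set   # events found in both sources
    total = self_set | record_set     # events found in either source
    if not total:
        raise ValueError("no disease reports from either source")
    return len(matched) / len(total)


# Hypothetical subject; codes and years are invented for illustration.
self_reports = [("401", 1979), ("466", 1981), ("477", 1983)]
record_reports = [("401", 1979), ("466", 1982), ("473", 1983), ("413", 1984)]

rate = concordance_rate(self_reports, record_reports)
print(f"{rate:.3f}")
```

In this toy case the same code (466) reported for 1981 by the subject and for 1982 in the record counts as two unmatched events, which is the consequence of the strict same-year rule described above.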

Statistical analysis
Concordance between self-reports and medical records was assessed by computing the proportion of reported illness episodes which were reported by both sources. The effect of subject characteristics on the rate of confirmation was examined using binomial generalized estimating equations (GEE; Zeger and Liang, 1986). From these models we obtain odds ratio estimates and their 95% confidence intervals. The odds ratio is a measure of the magnitude and direction of the association between a given factor and the probability of concordance. An odds ratio greater than one indicates a positive association with concordance, while an odds ratio less than one indicates a negative association. The GEE approach was used because, in many cases, each subject contributed more than one illness episode to the analysis, with the potential for non-zero intraclass correlation within subjects.
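As a reading aid for the odds ratios reported in this paper, the sketch below shows how a coefficient on the log-odds scale and its standard error translate into an odds ratio, an approximate 95% interval, and a two-sided p-value. It assumes a Wald-type interval based on the normal approximation, which the text does not specify; the function name `odds_ratio_summary` and the input values are hypothetical and are not taken from the fitted models.

```python
import math


def odds_ratio_summary(beta, se, z_crit=1.96):
    """Convert a log-odds coefficient and its standard error into an
    odds ratio, an approximate 95% Wald interval, and a two-sided p-value."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, lower, upper, p_value


# Illustrative values only; not estimates from this study.
or_est, ci_low, ci_high, p = odds_ratio_summary(beta=-0.5, se=0.25)
print(f"OR = {or_est:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}, p = {p:.3f}")
```

An interval that lies entirely below one, as here, corresponds to a negative association with concordance at the 5% level.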

Results
Overall concordance
As reported by Bartrop and co-workers (Jones et al., 2010), the lack of agreement between the self-report and record sources could not be ascribed to the morbidity of the bereaved being differentially reported relative to the non-bereaved subjects. An examination was performed of the overall rate of concordance between the two sources (Table 1). As already mentioned, 300 (22%) of all diseases reported were found in both sources.
For those diseases that did not match, 65% were record reports while 35% were self-reports. Notably, since the record source reported the larger number of illnesses that were not confirmed (Table 1) this argues against the explanation of subject exaggeration.
There are many postulates that could explain why there is poor concordance between sources. The lack of agreement could be due to problems with the self-report source (e.g., problems with subjects' recall), problems with the record source (e.g., effects of subjects' care-seeking behavior or problems with doctors' and hospitals' record keeping), or problems with both. Some variables that could have an impact on subjects' recall, care-seeking behavior, and/or record keeping were collected in this study and are analyzed here to investigate their possible impact on concordance. We felt that care-seeking behavior potentially affects concordance, as subjects who visit their doctor often, or visit many doctors, may, perhaps through hypochondriasis, be more particular in recording their own medical history.

Sociodemographic influences
To consider the possibility that social and lifestyle factors might have influenced recall or care-seeking behavior, we examined concordance as a function of a number of sociodemographic factors. These were age, sex, marital status (single, married, widowed, other), occupation of study subjects (none, pensioner, trades, domestic duties, professional, other), smoking (current, ex-smoker, never, other), alcohol (never, occasionally, weekly, daily, other), body mass index (BMI), weekly net income ($0-100, $101-150, $151-200, $201-250, $251-300, >$300), subject's judgment of their financial state (poor, okay, affluent, other), and finally year of enrollment in the original study (1975, 1976, 1977). Of these, a number appeared to have a statistically significant effect on the concordance rate between subject recall and medical records.
Numerical results for factors with discrete categories are given in Table 2, while the direction and magnitude of the relationship between quantitative factors and the probability of concordance are given below in terms of odds ratios. The significant factors were age (p = 0.01, OR = 1.02, 95% CI 1.00-1.03), occupation of the subject (p = 0.001), change in smoking status (subjects whose smoking status changed over the follow-up period had higher rates of concordance than those whose status did not; p = 0.01), years since enrollment (p = 0.02, OR = 0.95, 95% CI 0.90-0.99), income (p = 0.046), finances (p = 0.03), number of visits to GPs (p = 0.005, OR = 1.01, 95% CI 1.00-1.01), and being on a chronic medication regimen (p = 0.02). The clinical significance of these factors must be tempered, however, by noting that concordance was poor in all subgroups of all factors considered. For example, for occupation of subject, the best subgroup was that of domestic duties, where 45% of illnesses were confirmed (Table 2).

Mood influences
The influence of the subject's mood state on self-report and medical record concordance in reporting of illness events was considered, but none of the Spielberger State and Trait anxiety (Spielberger et al., 1970) scales, the CESD (Radloff, 1977), or Hamilton depression (Hamilton, 1960) scales showed evidence of an effect on concordance. As an alternative view of the effect of these scales, we defined each subject as being in or outside of the clinically normal range (as defined by community norms) and computed concordance rates for each category of each scale. Only in the case of the Hamilton depression scale did a subject's classification appear to affect the rate of concordance: we found a lower observed rate of concordance for subjects in the normal range (23.3%) than for those scoring high values (50.8%). However, in both cases concordance rates were low.

Time elapsed and memory recall
The time elapsed since the illness is also an intuitively likely factor in the rate of concordance, either in a positive way because of an illness's proximity to a significant life event, i.e., bereavement, or in a negative way because of memory loss caused by the time elapsed between illness occurrence and data accrual at the end of follow-up. In the case of bereaved subjects, illnesses which occurred shortly after bereavement may be recalled more clearly and/or subjects may have made more visits to their medical practitioners at that time and therefore have more complete medical records soon after bereavement. The time between illness occurrence and data accrual in 1986 was studied because it can be thought of as the "forgetting" time. A longer interval between occurrence and interview yields a greater potential for the illness not to be recalled. The rate of concordance was therefore examined as a function of these two "times." Substantial variability in concordance was observed with both the number of years elapsed prior to and after the illness occurrence, but the association between elapsed time and concordance rate only reached statistical significance when relating rates of concordance to time prior to illness (prior, p = 0.04, OR = 0.51, 95% CI 0.27-0.97; after, p = 0.14).

It could be that concordance is related to the type of disease. Some diseases may be more memorable than others from the patient's point of view, or more likely to prompt a medical visit, while others may be more likely to be recorded in medical records. For this reason concordance between sources was assessed within ICD-9 groupings. In total 1365 distinct illnesses were found in the follow-up of our subjects (self-report and records); these were spread through most ICD-9 categories. We examined variation among ICD-9 categories by comparing, for each category, the concordance rate observed in that category to the concordance rate observed among the other categories combined. The minimum agreement, excluding ill-defined illnesses (Table 2), was for respiratory diseases (Table 2), which differed from other disease categories (p = 0.001, OR = 0.51, 95% CI 0.34-0.76). The most common illnesses in this category were (i) among self-reports: URTI, bronchitis, and allergic rhinitis and (ii) among record reports: allergic rhinitis, chronic sinusitis, and bronchitis. The maximum agreement was among circulatory disorders (Table 2), which also differed from other categories (p = 0.001, OR = 1.87, 95% CI 1.27-2.75). In this category the most common illnesses were essential hypertension and angina according to both self- and record reports.
Although concordance in the psychiatric category did not differ statistically from other categories, it is worth recording that both subjects and records reported neurotic depression and depressive disorders not otherwise classified as the most common disorders.

Duration of illness
We considered the possibility that long-term ailments (spread across more than one calendar year) may be more likely to be remembered by both sources since they have a greater impact on the patient and the physician is more likely to have seen the subject. We found a positive relationship between illness duration and rate of concordance which failed to reach statistical significance (p = 0.07, OR = 1.06, 95% CI 1.00-1.12). It should be noted however that in absolute terms the concordance rate among long-term illnesses is still only 31%.

Methodological influences
We considered the potential for methodological issues to affect or bias the concordance between subjects' and their physicians' reporting of morbidity. The only substantive methodological factor recorded in this study which might affect concordance between sources was the method of interview (MOI) of subjects and of their physicians. For subjects, MOI could not be a factor in this study, since 98% of subjects were interviewed in person. Nor did we find a statistically significant effect of the MOI of doctors on concordance (p = 0.5), although personal interview had the highest observed rate of concordance. The clinical importance of this finding must be qualified in two ways: first, doctors were generally interviewed in person, with few exceptions; second, all concordance rates were low (Table 2).
In addition, we examined the possibility that the introduction of Medicare in Australia in January 1984 influenced memory or record keeping practices. The introduction of Medicare, however, had no statistically significant effect on the concordance rate (p = 0.34).

Discussion
Prospective collection of morbidity data can be expensive and logistically difficult, so it is not surprising that retrospective data collection via existing sources is sometimes utilized. If researchers do not themselves observe the measurements being recorded, problems such as recall bias may occur. The problem of discordance among multiple sources of health-related data is not new, with studies dating back at least to the 1950s-1960s (Kreuger, 1957; National Centre for Health Statistics, 1965). However, most of what little work is available on the agreement between morbidity sources is more recent, such as Barr et al. (2009). Further, the question of concordance is not unique to morbidity sources but is of interest for many sources of data. Two other examples of researchers' interest in concordance between sources are the matching of medication regimen data as reported by patients and other data sources (Gerbert et al., 1988; Monpetit and Ray, 1988; Goodman et al., 1990) and the agreement between health status ratings as reported by subjects and doctors' ratings (Friedsam and Martin, 1963; Bergner et al., 1976; LaRue et al., 1979; Levkoff et al., 1987).
The question of whether medical records or patient recall is the preferred source of morbidity history does not seem to have been definitively answered, perhaps because there is no global answer and the relative accuracy of these two sources varies with both the nature of the research and the population from which subjects are sampled. Medical record systems and individual recall may put higher weight on different forms of illness. Therefore studies which rely on self-reported illnesses are limited in the accuracy of their results by the accuracy of human recall. Arguably, however, there may be cases where the reverse is true, such as procedures or illnesses which are minor from a medical perspective but not from the patient's perspective (O'Flaherty et al., 1987). This idea is supported by Coulter et al. (1985), who studied surgical procedures noted in GP records, and Bryant et al. (1989), who studied illness recall in pregnancy. The presumption of record infallibility, or more directly, its correctness, has potentially serious implications for any study of human morbidity. Formal study of this problem is also limited by the lack of a true gold standard against which to compare any given source. A number of authors have pointed out that to consider one source as being universally correct or incorrect is probably a substantial oversimplification. Tretli et al. (1982) point out that the language and phrasing of questions must be appropriate to the social and educational background of the subjects. Similarly, Colditz et al. (1986) and Idler et al. (1990) note that concordance is enhanced by clear diagnostic criteria and, in common with our own findings noted above, that more serious diseases tend to be confirmed more often than less serious disorders.
In our work we have found generally poor concordance between subjects and their medical records. The difference between a subject's reported recollection of their medical history and that given in their medical record is potentially influenced by many factors, of which a number were studied in this work. A number of factors were found to affect rates of concordance, including the time elapsed since an illness occurred, its proximity to other significant life events, and the nature of the illness. Time since the illness occurrence might be important since memory can fail with time and records can be lost or mislaid. ICD-9 class was a factor whose influence was easily explicable. Notably, diseases in the circulatory category, which are likely to be relatively serious, were associated with higher rates of concordance, while those in the respiratory category, which includes minor URTI-type illnesses in addition to serious disorders such as lung cancer, were associated with lower rates of concordance. Other authors (Colditz et al., 1986; Bryant et al., 1989; Idler et al., 1990) have considered the nature of the illness and have found it to cause notable variability in concordance between patients' accounts and record data. In particular, Coulter et al. (1985) found that, for concordance between GP records and patient recall of surgical procedures, rates of 90% were observed when the year of occurrence was ignored and 82% when the year of the procedure was considered. There were also factors, such as smoking status during the follow-up period, for which there are plausible explanations but which are not easily verified. We found that subjects whose smoking status had changed exhibited relatively high rates of concordance. This is possibly explained by smokers who have quit the habit ("other" category) becoming health conscious and therefore keeping better mental and/or written records of disease. Finally, there were some factors, such as income, where the reason for the effect on concordance is not readily explained. While various indices of socio-economic state, in particular income, have been shown to have an effect on health outcomes (Idler et al., 1990), we can offer no intuitive or theory-driven reason for income to affect concordance.
Why our data suggest quite low rates of concordance between patient and medical records remains an open question, and while we have identified some sources of discordance, others remain to be determined. These may include un-noted methodological causes, such as definitions, or differential bias in patient and medical record keeping.
Our results stand in contrast to some other work which showed much higher rates of subject-record concordance, including that of Colditz et al. (1986), which found rates between 68 and 90%, and Barr et al. (2009), who found 68% agreement for cardiovascular disease events. Our findings are, however, in line with those of some workers in this area, such as Tilley et al. (1985) and Kreuger (1957), who found quite low rates of concordance for some illnesses and higher rates for others. The clearest message from our data appears to be that neither patient recall nor medical records can be universally considered the more definitive source of historical morbidity information. An important consequence of this finding for study design is that retrospective data capture may lead to a substantially inaccurate profile of patient morbidity compared with prospective data collection. While this idea is not a new one, the current study quantifies the extent of inaccuracy possible using any form of morbidity recall.

Conclusion
Concordance between subjects and their medical records was examined in relation to a number of characteristics. A number of factors related to both the subject and their illness(es) were found to have statistically significant effects on the rate of subject-record concordance. In the former category were some intuitively obvious factors such as the age of the subject and the time elapsed between enrollment and occurrence of the disease. In addition we found that some less obvious factors, such as evidence of increased care-seeking behavior, affected the rate of concordance. In this instance more visits to physicians were associated with higher rates of concordance. In the latter category, notably, was the type of disease being recalled. Diseases which might be considered serious by both subject and physician, such as cardiovascular disorders, exhibited greater concordance than less serious diseases such as upper respiratory tract infections.
The overall poor concordance found in our study has some consequence as both a methodological issue in epidemiological research and a public health question in terms of medical record keeping. In the former case, it is clear that neither subject self-reports nor medical records can be relied upon in a general health survey setting. It seems that both under-report some types of illness.

Acknowledgments
We gratefully acknowledge financial assistance from the Staff Specialists Trust Fund, Area Health Human Research Trust Funds and the Department of Radiotherapy at the Northern Sydney and Central Coast Area Health Service, and the NSW Institute of Psychiatry. Our thanks also to Dr. Rosie Kubb who collected a substantial amount of the data.