Treated post-acute sequelae after COVID-19 in a German matched cohort study using routine data from 230,256 adults

Background: Post-acute sequelae after COVID-19 are still associated with knowledge gaps and uncertainties at the end of 2022, e.g., prevalence, pathogenesis, treatment, and long-term outcomes, and pose challenges for health providers in medical management. The aim of this study was to contribute to the understanding of the multi-faceted condition of long-/post-COVID. It was designed to evaluate whether a prior SARS-CoV-2 infection during the first COVID-19 wave in Germany increases the rate of disease, as measured via a record of insurance data on diagnoses, symptoms, and treatment, in the subsequent 12 months compared with matched control groups without recorded SARS-CoV-2 infection.
Method: 50 outcome variables at disease, symptom and treatment levels (14 main categories and 36 sub-categories; new diagnoses) were defined from health insurance data. Logistic regression was carried out for two groups of patients who tested positive for SARS-CoV-2 in a PCR test in March/April 2020, each compared to its respective risk-adjusted (age, administrative region, 1:5 propensity-score matching), contemporaneous control group without prior documented SARS-CoV-2 infection (CG): first, individuals with outpatient treatment of acute COVID-19, indicating a non-severe course (COV-OUT), and second, individuals with inpatient treatment of acute COVID-19, indicating a severe course (COV-IN).
Results: The mortality rate in the COV-OUT (n = 32,378) and COV-IN (n = 5,998) groups was higher than in their control groups, with odds ratios (OR) of 1.5 [95%CI (1.3, 1.6)] and 1.7 [95%CI (1.5, 1.8)], respectively. Both groups were more likely to have experienced at least one outcome compared to their CG [OR = 1.4, 95%CI (1.4, 1.4); OR = 2.5, 95%CI (2.4, 2.6)]. Of the 50 outcome variables, 42 (COV-IN) and 37 (COV-OUT) showed increased ORs. COV-OUT: Loss of taste and smell [OR = 5.8, 95%CI (5.1, 6.6)], interstitial respiratory diseases [OR = 2.8, 95%CI (2.0, 4.1)] and breathing disorders [OR = 3.2, 95%CI (2.2, 4.7)] showed the highest ORs. COV-IN: Interstitial respiratory diseases [OR = 12.2, 95%CI (8.5, 17.5)], oxygen therapy [OR = 8.1, 95%CI (6.4, 10.2)] and pulmonary embolism/anticoagulation [OR = 5.9, 95%CI (4.4, 7.9)] were the most pronounced.
Conclusion: Following a SARS-CoV-2 infection during the first wave of the COVID-19 pandemic in Germany, 8.4 [COV-OUT, 95%CI (7.7, 9.1)] and 25.5 [COV-IN, 95%CI (23.6, 27.4)] percentage points more subjects, respectively, showed at least one new diagnosis/symptom/treatment compared to their matched CG (COV-OUT: 44.9%, CG: 36.5%; COV-IN: 72.0%, CG: 46.5%). Because the symptoms and diagnoses are so varied, interdisciplinary and interprofessional cooperation among those providing management is necessary.

Introduction
In retrospect, the COVID-19 pandemic is described in different phases: the first COVID-19 wave lasted from calendar week 10 to 20/2020, and further COVID-19 waves followed (1). Modifications in the pathogen characteristics led to the emergence of virus variants (2). In Germany, more than 36 million SARS-CoV-2-positive cases were registered with the Federal Institute for Public Health (Robert Koch Institute), with 157,495 deaths, corresponding to a case fatality rate of 0.43% (as of November 28, 2022) (3). Sequelae after acute COVID-19 were initially referred to as long COVID in 2020 (4). Since then, research into post-acute sequelae after COVID-19 has been carried out worldwide, but knowledge gaps and uncertainties concerning, for example, prevalence, pathogenesis, treatment and long-term effects present healthcare professionals with coordinative, organisational and financial challenges in medical management (5)(6)(7)(8).
Currently, different symptom-based definitions of long-/post-COVID exist. The National Institute for Health and Care Excellence categorises health problems that occur up to four weeks after the beginning of the disease as acute COVID, those between 4 and 12 weeks as "persistent COVID-19"/long-COVID, and those from the 12th week onwards, congruent with the clinical case definition by the World Health Organisation (WHO), as post-COVID syndrome (9)(10)(11).
Data on the prevalence of sequelae following SARS-CoV-2 are heterogeneous and mostly based on non-controlled studies in which mainly symptoms were enquired about, or coded diagnoses were described without recording treatment needs such as administering medicines or prescribing therapies (12,13).
Owing to the lack of objective parameters to diagnose and identify post-COVID as of today, diagnoses are based on symptoms. The WHO identified fatigue, shortness of breath, neurocognitive and other symptoms as common post-COVID symptoms (11). Fatigue, dyspnoea, sleep disorders and myalgias are described as the most common symptoms persisting 12 months after infection (12), whereas fatigue, neurocognitive impairment and chest symptoms are prevalent after 6 to 12 months (14). Neuropsychiatric symptoms, pulmonary, liver, heart and kidney disorders, thrombosis, stroke, and embolism (15) were identified in a meta-analysis as additional post-acute COVID-19 sequelae. Analyses of follow-up routine data from hospitalised patients showed an increased risk of morbidity, mortality, and hospital readmission (16, 17).
The aim of the study was to evaluate whether a prior SARS-CoV-2 infection during the first COVID-19 wave in Germany increases the rate of disease, as measured via a record of insurance data on diagnoses, symptoms, and treatment, in the subsequent 12 months compared with matched control groups without recorded SARS-CoV-2 infection. Furthermore, sex differences were also to be investigated. The analysis is based on health insurance data of the AOK, Germany's largest statutory health insurance, which contains information on the use of health care facilities, including diagnoses and treatments from billing data.

Study design and setting
A matched cohort study was carried out among persons insured with the AOK who were treated for COVID-19 as outpatients or inpatients. The AOK is the largest German statutory health insurance, covering about 30 per cent of the German population. Germany-wide billing data from outpatient and hospital care, prescriptions of medicines, medical aids and remedies, as well as the master data of AOK-insured persons for the period from April 2019 to June 2021 (index period) were analysed. Data of patients who had a positive polymerase chain reaction (PCR) test for SARS-CoV-2 were observed for twelve months (equivalent to four quarters) after infection (post-observation period), beginning 4 weeks after the COVID-19 diagnosis, which corresponds to the usual definition of post-acute sequelae after COVID-19 (9,10). In order to detect new (incident) cases or worsening diseases in the post-observation period, patients' data were pre-observed across the twelve months before their COVID-19 diagnosis (pre-observation period) (see Figure 1).

Sampling
Insured persons aged 18-99 years were selected if they had been insured with the AOK without interruption during the pre-observation period until the start of post-observation and were then continuously insured either for the entire post-observation period or until their death (if this occurred before that). Four groups were distinguished: (1) The "COVID-19 outpatient" group included those who had a first outpatient diagnosis of U07.1 according to ICD-10-GM in April/May 2020 (first wave of the COVID-19 pandemic in Germany), had made use of an outpatient SARS-CoV-2 PCR test, and had not been treated in hospital between April and June 2020, indicating a non-severe course of acute COVID-19 (COV-OUT). (2) The "COVID-19 inpatient" group included those who were treated for COVID-19 for the first time as inpatients in April/May 2020, indicating a severe course of acute COVID-19 (COV-IN). Cases with a principal diagnosis of respiratory failure, pulmonary embolism, viral infection, sepsis or renal failure and a secondary diagnosis of U07.1 were included in line with Guenster et al. (17), since the COVID-19 diagnosis cannot be documented as a principal diagnosis in inpatient billing data. (3 and 4) For each of these groups, a contemporaneous control group was formed of persons who visited their general practitioner at least once in April/May 2020 (non-users excluded), had no hospitalisation in April-June 2020 and had no COVID-19-related diagnosis during the entire observation period. Matching was used to form two risk-adjusted control groups for the two COVID-19 cohorts.

Outcome variables
Based on different factors described in the literature (15-24), 14 outcome categories consisting of 50 outcomes were defined in an interdisciplinary and interprofessional consensus process within the research group. The outcomes included new cases of acute disease, chronic diseases, symptoms, prescriptions of medicines, medical aids and remedies, as well as psychotherapy treatments and death.
Treated long COVID symptoms were operationalised by means of data referring to specific services covered by statutory health insurance, for example, respiratory disorders by means of medication or respiratory therapy, cardiac symptoms by cardiac co-treatments, thyroid diseases by medication (Supplement 1). The outcomes were operationalised as binary data (yes/no) and indicate whether an event occurred for the first time or worsened (renal insufficiency, hypertension) during the post-observation period compared to the pre-observation period (Supplement 1). The assignment of a nursing-care-dependency level in the pre-observation period was recorded as a binary variable (yes/no) in order to perform an ex-post analysis.
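The binary "new event" logic described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the representation of billing records as sets of code strings, and the example codes are hypothetical, and the study's actual operationalisation is the one given in Supplement 1. The "worsened" case (renal insufficiency, hypertension) is not covered by this sketch.

```python
def incident_outcome(pre_codes: set[str], post_codes: set[str],
                     outcome_codes: set[str]) -> bool:
    """Return True if an outcome is recorded in the post-observation period
    but was not recorded in the pre-observation period (hypothetical helper)."""
    had_before = bool(pre_codes & outcome_codes)   # already documented before COVID-19
    has_after = bool(post_codes & outcome_codes)   # documented in the 12 months after
    return has_after and not had_before


# Illustrative codes only; not taken from the study's supplement.
pulmonary_embolism = {"I26"}
print(incident_outcome({"I10"}, {"I10", "I26"}, pulmonary_embolism))  # new event
print(incident_outcome({"I26"}, {"I26"}, pulmonary_embolism))         # pre-existing
```

Requiring absence in the pre-observation period is what turns a prevalent diagnosis into an incident one in this sketch.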

Matching
To form the control groups, a two-stage matching procedure was defined and employed, resulting in five control group matches for each individual in the two COVID-19 groups. For the 38,376 individuals included in the COVID-19 groups, roughly 8 million individuals were available from which to choose statistical matches.
First, an exact matching was carried out in the program R (version 4.1.1) with regard to age in years and the four-stage settlement structure of the administrative region of the insured person's place of residence. Age is described as a risk factor for developing post-acute symptoms after COVID-19 (25). The administrative region used in spatial research (26) provides information about the different accessibilities of outpatient and inpatient healthcare facilities between regions.
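The first matching stage can be pictured as grouping candidates into strata. The sketch below is a minimal Python illustration, not the authors' R code; the field names `id`, `age` and `region_type` and the function name are assumptions made for the example.

```python
from collections import defaultdict


def exact_match_pools(cases: list[dict], candidates: list[dict]) -> dict:
    """Map each case id to the ids of candidates that share its exact
    age in years and settlement-structure type (hypothetical field names)."""
    strata = defaultdict(list)
    for cand in candidates:
        strata[(cand["age"], cand["region_type"])].append(cand["id"])
    # .get avoids creating empty strata for cases without any candidate
    return {
        case["id"]: list(strata.get((case["age"], case["region_type"]), []))
        for case in cases
    }


cases = [{"id": "case1", "age": 48, "region_type": 2}]
candidates = [
    {"id": "c1", "age": 48, "region_type": 2},
    {"id": "c2", "age": 48, "region_type": 3},
    {"id": "c3", "age": 47, "region_type": 2},
]
print(exact_match_pools(cases, candidates))
```

Only candidates inside a case's stratum would then be eligible for the second, propensity-score-based stage.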
Secondly, propensity score matching was applied for morbidity-related risk factors for severe COVID-19 courses that existed in the pre-observation period. The 35 disease groups specified in Roessler et al. (27) were used. The propensity score was determined using logistic regression and an optimal matching procedure was implemented using the R package optmatch (version 0.10.5).
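To give an idea of what 1:5 matching on a propensity score involves, the sketch below uses a greedy nearest-neighbour rule without replacement. This is a deliberately simpler stand-in and is not equivalent to the optimal matching the study performed with optmatch; the function name and the dictionaries of precomputed scores are assumptions for the example.

```python
def greedy_ps_match(case_scores: dict[str, float],
                    control_scores: dict[str, float],
                    k: int = 5) -> dict[str, list[str]]:
    """Assign each case its k nearest controls by propensity score,
    without replacement (greedy heuristic, not optimal matching)."""
    available = dict(control_scores)
    matches = {}
    for case_id, score in sorted(case_scores.items()):
        # rank the remaining controls by absolute score distance to this case
        nearest = sorted(available, key=lambda c: abs(available[c] - score))[:k]
        matches[case_id] = nearest
        for control_id in nearest:
            del available[control_id]  # each control is used at most once
    return matches


cases = {"case1": 0.20}
controls = {"c1": 0.21, "c2": 0.90, "c3": 0.18, "c4": 0.50}
print(greedy_ps_match(cases, controls, k=2))
```

A greedy rule depends on the order in which cases are processed, which is one reason optimal matching, which minimises the total distance over all matched sets, is often preferred.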
The evaluation of standardized differences showed a very high degree of balance in the matched samples, due to the vast pool of possible matches. Therefore, the propensity score was not used for double adjustment in the subsequent logistic regressions.
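For a binary covariate such as the presence of one of the disease groups, a standardized difference can be computed from the two group prevalences. The sketch below uses the usual pooled-variance formula; the function name and the example prevalences are hypothetical, and the text does not state which threshold the authors applied (values below 0.1 are a commonly cited rule of thumb for good balance).

```python
import math


def standardized_difference(p_treated: float, p_control: float) -> float:
    """Standardized difference for a binary covariate, given its
    prevalence in the COVID-19 group and in the matched control group."""
    pooled_var = (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    if pooled_var == 0:
        return 0.0  # covariate is constant in both groups
    return (p_treated - p_control) / math.sqrt(pooled_var)


# Hypothetical prevalences of one pre-existing condition in both groups
print(round(standardized_difference(0.30, 0.28), 3))
```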

Statistical analyses
For the description of the basic characteristics of the study groups as well as the distribution of the outcome variables, proportions are given for categorical variables and mean and standard deviation for continuous variables. The effect of COVID-19 on each of the 50 outcome variables was estimated using logistic regressions, separately for the outpatient COVID-19 group with its control group and for the inpatient COVID-19 group with its control group. In order to identify sex differences, the regression models were extended to include the explanatory variable sex and the interaction sex*COVID-19. If the interaction is significant, the effect of COVID-19 differs between women and men. In these cases, the COVID-19 effect was ultimately estimated individually for women and men using logistic regression. Only outcome variables with a frequency (also within the sex categories) greater than or equal to 10 persons (28) were considered. In a sensitivity analysis, the additional influence of a pre-existing nursing-care dependency was examined. The data transformations and analyses were carried out using SQL and R (version 4.1.1). The reporting is based on the RECORD checklist (29).
Table 1 shows the characteristics of the study population. A total of 8,392,550 insured persons were included, of which 32,378 were in the COVID-19 outpatient group and 5,998 in the COVID-19 inpatient group. The outpatient COVID-19 group, aged an average of 48.4 years, is significantly younger than the [...]
Figure 4 shows the models that revealed a significant effect of the interaction of sex*COVID-19 in the second step of the logistic regression.
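The two-step regression approach with the sex*COVID-19 interaction can be written out as follows. The notation is an assumption introduced here for illustration (Y is a binary outcome, C indicates membership in the COVID-19 group, S indicates sex, and the choice of reference category is arbitrary); it is the standard form of such a model rather than a formula taken from the paper.

```latex
\begin{align}
  \text{Step 1:}\quad
    \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_1 C,
    & \mathrm{OR}_{\text{COVID}} &= e^{\beta_1} \\
  \text{Step 2:}\quad
    \operatorname{logit} P(Y = 1) &= \beta_0 + \beta_1 C + \beta_2 S + \beta_3\,(C \times S) \\
  \text{Reference sex } (S = 0):\quad
    \mathrm{OR}_{\text{COVID}} &= e^{\beta_1} \\
  \text{Other sex } (S = 1):\quad
    \mathrm{OR}_{\text{COVID}} &= e^{\beta_1 + \beta_3}
\end{align}
```

A significant interaction coefficient (beta_3 different from zero) means that the two sex-specific odds ratios differ, which is the condition under which separate models for women and men were estimated.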

Descriptive results and logistic regression
The odds ratio (OR) of developing post-acute treated health outcomes at the disease, symptom or treatment level in the 12 months following COVID-19 is increased for 42 of 50 (COVID-19 inpatient group) and 37 of 50 (COVID-19 outpatient group) outcomes compared to the control group. With regard to most outcome variables, people who were treated as inpatients for COVID-19 are more affected. The occurrence of at least one of the outcome variables in an individual is more frequent in both the outpatient and inpatient COVID-19 group compared to their control groups (Figure 2). The low rate of occurrence for some outcome variables must be taken into consideration, e.g., for interstitial pulmonary diseases (0.1% (outpatient COVID-19 group); 0.0% (control group)), loss of smell and taste (1.6%; 0.3%) or thromboses (0.4%; 0.3%). No evident differences are seen in the outpatient group in the main categories endocrinal diseases (incl. sub-categories) and cognitive functional impairments/language disorders, as well as in a few sub-categories: obsessive-compulsive disorders, pulmonary embolism with anticoagulation, coronary heart disease, heart failure, transient ischaemic attacks and stroke. Because of the low number of cases, myocarditis, cerebral sinus vein thrombosis, intracerebral haemorrhage and myopathy were not evaluated (Figure 2).
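How a large odds ratio can coexist with a low absolute rate is easy to see from the crude calculation. The sketch below derives an unadjusted odds ratio from two occurrence rates, using the rounded outpatient figures for loss of smell and taste quoted above (1.6% vs. 0.3%); the function name is hypothetical. Because the input rates are rounded to one decimal, the result (about 5.4) only approximates the reported model-based OR of 5.8.

```python
def odds_ratio(p_exposed: float, p_control: float) -> float:
    """Crude odds ratio from the outcome rate in the COVID-19 group
    and the outcome rate in the control group."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_control = p_control / (1.0 - p_control)
    return odds_exposed / odds_control


# Rounded rates for loss of smell and taste in the outpatient sample
smell_taste_or = odds_ratio(0.016, 0.003)
print(round(smell_taste_or, 1))
```

For rare outcomes the odds ratio is close to the ratio of the two rates, so an OR above 5 here corresponds to an absolute difference of only about 1.3 percentage points.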
In the inpatient group, the highest ORs are shown for interstitial lung diseases in the main category pulmonary diseases [OR = 12.2, 95%CI (8.5, 17.5)], oxygen therapy [OR = 8.1, 95%CI (6.4, 10.2)] and pulmonary embolism with anticoagulation [OR = 5.9, 95%CI (4.4, 7.9)]. In the sub-category coronary heart disease there were no detectable differences in inpatients. Due to the small number of cases, it was not possible to analyse the sub-categories obsessive-compulsive disorders, treatment of breathing disorders by prescribing remedies, myocarditis/pericarditis, cerebral sinus vein thrombosis, intracerebral haemorrhage, myopathy and neuropathy. As shown in Figure 3, the small number of cases for some outcomes is to be considered, for example, interstitial respiratory diseases (1.8% (inpatient [...]
FIGURE 2 Descriptive results and odds ratios of outcome variables in the outpatient sample. 1 Operationalisation of the outcome variables is described in Supplement 1. 2 OR = Odds Ratio; CI = Confidence Interval; * p <= 0.05; ** p <= 0.01; *** p <= 0.001; n.a. = not assessed when value <10 (also within the sex categories).
FIGURE 3 Descriptive results and odds ratios of outcome variables in the inpatient sample. 1 Operationalisation of the outcome variables is described in Supplement 1. 2 OR = Odds Ratio; CI = Confidence Interval; * p <= 0.05; ** p <= 0.01; *** p <= 0.001; n.a. = not assessed when value <10 (also within the sex categories).
Among male compared to female individuals, an increased OR is evident for the inpatient group in the categories psychiatric medication, interstitial respiratory diseases, thyroid diseases, hypertension and neuropathy (Figure 4). Only in the category diabetes mellitus is an increased OR shown for women compared to men.
To evaluate the possibility of distorted results, nursing dependency level was included in a sensitivity analysis as a proxy variable for a potentially higher pre-existing morbidity in the COVID-19 groups compared to the matched CGs (39). The odds ratios for outcomes relating to cardiac arrhythmias, neuropathy, chronic pain, and mobility problems were attenuated slightly and the confidence intervals included 1. However, the 95% confidence intervals for these outcome variables had already been very close to the value 1 beforehand. Furthermore, while the OR of the outcome variable death is still significantly increased in both samples, it drops by circa 0.2 points (outpatient: OR = 1.

Discussion
After a SARS-CoV-2 infection, there is an increased occurrence of post-acute health sequelae in both the outpatient and inpatient COVID-19 groups compared to the contemporaneous control groups, with greater severity after inpatient courses. Similar results are shown in US data, but with stratification by age and without consideration of treatment by medication, therapies, or psychotherapy (30). At the symptom level, the prominent symptom complex described in the literature is confirmed in the areas of lung, neurocognition/mental health and fatigue (14,22,23). There is an increased probability of occurrence of diseases with a potentially acutely serious course such as myocardial infarction, stroke/transient ischaemic attack, thrombosis and pulmonary embolism (inpatient), as well as myocardial infarction and thrombosis (outpatient). This is compatible with findings on hypercoagulability in the acute COVID phase (31) and health sequelae of thrombo-embolic and cardio-ischaemic nature after COVID-19 (15,16,20, 32) and clearly shows a risk beyond the acute COVID-19 phase of 4 weeks. Even though these complications are rare in absolute numbers, the frequency of persistent symptoms (dyspnoea, chest pain) together with the increased mortality (7.7% vs. 4.8% inpatient; 2.4% vs. 1.6% outpatient) makes the tension between underdiagnosis and overdiagnosis in the everyday clinical setting visible.
FIGURE 4 Descriptive results and odds ratios of outcome variables showing significant sex differences in the outpatient and the inpatient sample. 1 Operationalisation of the outcome variables is described in Supplement 1. 2 OR = Odds Ratio; CI = Confidence Interval; * p <= 0.05; ** p <= 0.01; *** p <= 0.001; n.a. = not assessed when value <10 (also within the sex categories).
Müller et al. 10.3389/fepid.2022.1089076 Frontiers in Epidemiology
Especially after inpatient COVID-19 courses, chronic diseases in need of treatment are seen to a greater extent. Continuity of care with a view to the whole person as well as risk-adapted check-ups (e.g., blood pressure, kidney values, indications of heart failure) appear to be sensible medical measures here. The results show similarities to Ayoubkhani et al. (16).
Both outpatients and inpatients showed an increased OR for mental health disorders after COVID-19 compared to the control group. This is in line with other results from the literature (24, 33, 34); at the same time, other studies report no increased risk (30) or only a temporarily increased risk of affective disorders for patients under 65 years of age (34). The sex difference in post-acute health sequelae after COVID-19 is less pronounced than previous findings suggested (14,35). After an inpatient acute COVID-19 course, men dominate in the categories with significant sex-specific differences, which could be due to more severe acute courses in men (36).

Strengths and weaknesses
The evaluation of all routine data of the AOKs results in a comprehensive database that includes more than 30% of the German resident population and comprises utilization information regarding the majority of health care services in Germany. Limitations arise because routine data of the statutory health insurance do not show morbidity as such, but (I) treated and (II) documented morbidity after (III) insured persons have sought medical help, (IV) have explicitly mentioned their disease and (V) it has been correctly coded. Other possible biases could be: under- or over-reporting at the symptom level; residual confounding; lack of specific coding options (e.g., post-exertional malaise, cognitive impairment); too narrow a selection of outcome diseases (e.g., autoimmune diseases, rheumatic diseases as well as vasculitides are hardly recorded); under-reporting/distortion due to the pandemic situation in the index period (possible non-testing bias in the control group due to changed utilisation behaviour in the first wave, leading to the possibility of individuals with an undiagnosed/undocumented SARS-CoV-2 infection being assigned to the control group); overestimation of the morbidity of the inpatient COVID-19 group due to hospitalisation. The underlying sample of AOK-insured persons corresponds to about one third of the total population and is thereby representative of the German population in age and gender. However, the representativeness of the CG population could have been limited by the matching process, as controls were individuals who 1. visited their general practitioner in April/May 2020, 2. had no hospitalisation in April to June 2020 and 3. had no COVID-19-related diagnosis during the entire observation period. As the results are derived from data from the first wave of the pandemic, it can be assumed that both vaccination and changes in virus variants would alter analyses in subsequent periods.
Current evidence suggests a lower probability of post-COVID after vaccination and after illness with the Omicron subtypes (37, 38).

Conclusions
In the routine data of the statutory health insurance, a very broad spectrum of post-acute treated health sequelae can be seen under the umbrella term "long-/post-COVID", including cardiac, neurological, psychological and thromboembolic diseases, fatigue, lung diseases and kidney function disorders. The variety of symptoms and diagnoses requires an interdisciplinary and interprofessional cooperation among general practitioners, medical specialists, psychotherapists and healthcare providers offering medical remedies.
Following a SARS-CoV-2 infection during the first wave of the COVID-19 pandemic in Germany, 8.4 [COV-OUT, 95%CI (7.7, 9.1)] and 25.5 [COV-IN, 95%CI (23.6, 27.4)] percentage points more subjects, respectively, showed at least one new diagnosis/symptom/treatment compared to their matched CG (COV-OUT: 44.9%, CG: 36.5%; COV-IN: 72.0%, CG: 46.5%). The results reinforce evidence of an increased burden on the health care system from increased utilization due to post-acute sequelae after COVID-19 (40,41). Further surveys should evaluate efficient, coordinated care pathways with consideration of patient-related outcomes such as quality of life, social participation, health care system resources, and economics (42).
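The percentage-point differences quoted above are risk differences. The sketch below recomputes the outpatient figure from the rounded percentages with a simple Wald interval, assuming a control group of five times the case group (161,890, following the 1:5 matching). The function name is hypothetical, and the method is an assumption: the paper does not state how its intervals were derived, so the interval from this sketch need not coincide exactly with the published one.

```python
import math


def risk_difference_ci(p1: float, n1: int, p0: float, n0: int, z: float = 1.96):
    """Risk difference p1 - p0 with a Wald confidence interval
    for two independent proportions (simplifying assumption)."""
    diff = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return diff, diff - z * se, diff + z * se


# COV-OUT: 44.9% of 32,378 cases vs. 36.5% of an assumed 5 x 32,378 controls
diff, lower, upper = risk_difference_ci(0.449, 32_378, 0.365, 5 * 32_378)
print(f"{100 * diff:.1f} pp (95% CI {100 * lower:.1f} to {100 * upper:.1f})")
```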

Data availability statement
The data analyzed in this study is subject to the following licenses/restrictions: Unfortunately, the data cannot be released. Since these data are not aggregated, but are available at the insured person level and are included in the analyses at this level, it is not possible to pass them on for data protection reasons. Requests to access these datasets should be directed to Doreen.Mueller@wido.bv.aok.de.

Ethics statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.