Mental Health Symptoms Unexpectedly Increased in Students Aged 11–19 Years During the 3.5 Years After the 2016 Fort McMurray Wildfire: Findings From 9,376 Survey Responses

In Fort McMurray, Alberta, Canada, the wildfire of May 2016 forced the population of 88,000 to rapidly evacuate in a traumatic and chaotic manner. Ten percent of the homes in the city were destroyed, and many more structures were damaged. Since youth are particularly vulnerable to negative effects of natural disasters, we examined possible long-term psychological impacts. To assess this, we partnered with Fort McMurray Public and Catholic Schools, who surveyed Grade 7–12 students (aged 11–19) in November 2017, 2018, and 2019, i.e., at 1.5, 2.5, and 3.5 years after the wildfire. The survey included validated measurement scales for post-traumatic stress disorder (PTSD), depression, anxiety, drug use, alcohol use, tobacco use, quality of life, self-esteem, and resilience. Data analysis was done on large-scale anonymous surveys including 3,070 samples in 2017; 3,265 samples in 2018; and 3,041 samples in 2019. The results were unexpected and showed that all mental health symptoms increased from 2017 to 2019, with the exception of tobacco use. Consistent with this pattern, self-esteem and quality of life scores decreased. Resilience scores did not change significantly. Thus, mental health measures worsened, in contrast to our initial hypothesis that they would improve over time. Of note, we observed higher levels of mental health distress among older students, in female compared to male students, and in individuals with a minority gender identity, including transgender and gender non-conforming individuals. These findings demonstrate that deleterious mental health effects can persist in youth for years following a wildfire disaster. This highlights the need for multi-year mental health support programs for youth in post-disaster situations. The indication that multi-year, post-disaster support is warranted is relatively novel, although not unknown.
There is a need to systematically investigate factors associated with youth recovery following a wildfire disaster, as well as efficacy of psychosocial strategies during later phases of disaster recovery relative to early post-disaster interventions.


INTRODUCTION
In May 2016, a large wildfire affected Fort McMurray, Alberta, Canada and the surrounding area. Called "The Beast" in the popular media (1), the wildfire necessitated the evacuation of the entire city on May 3, 2016, and over 88,000 residents were displaced for several weeks. In addition to damaging community structures and infrastructure, the fire destroyed 10% of the homes in Fort McMurray, leaving many individuals homeless. The fire burned 590,000 hectares of land before it was brought under control on July 4, 2016. Insurance costs for the damages were estimated at $3.6 billion by the Insurance Bureau of Canada, which made the Fort McMurray wildfire the most expensive insured catastrophe in Canadian history (2). Many individuals were left jobless due to damage and closure of local businesses. Social, emotional, and psychological difficulties also affected the community, as is typical after severe disasters (3,4).
Disasters tend to impact children and youth in particular (5-10) given developmental vulnerabilities (cognitive, emotional, social, and physiological) associated with childhood and adolescence, including the need to rely on others for support (11). Our group has previously examined Fort McMurray school mental health surveys completed by Grade 7-12 students in November 2017 (12,13). Brown et al. (12) reported elevated mental health symptoms for PTSD and depression compared to a control population from the same province which had not experienced a natural disaster. As reported in (13), individuals who were more personally impacted by the 2016 wildfire, such as having their home destroyed, exhibited greater symptoms of PTSD, depression, anxiety, and alcohol and/or substance misuse. These findings were consistent with previous findings of altered mental health in youth following natural disasters (9,14).
Longitudinal studies, most of which have focused on adults, have reported long-term negative impacts on mental health from natural disasters (17,25,41-46). The pattern of long-term effects can be complex, with individuals exhibiting better recovery from some symptoms, such as depression or anxiety, but lesser recovery for others such as PTSD symptoms (47). Relatively few longitudinal studies have focused on mental health outcomes among youth following natural disaster, although studies of children and youth following Hurricane Katrina have shown long-term impacts on mental health (48), with varying recovery trajectories for different groups (49,50). We are aware of only three population studies that examined wildfire impacts on youth mental health, including one study focusing on PTSD and depression (19) and our previous Fort McMurray studies (12,13).
In the present study, our hypothesis was that mental health symptoms among Fort McMurray youth would improve with time following the November 2017 survey. This hypothesis was based on theoretical work on trauma recovery, which consistently emphasises the role of time in the recovery process (4,51,52). To test this hypothesis, we examined mental health survey data collected by Fort McMurray school boards in Grades 7-12 in November of three consecutive years: 2017, 2018, and 2019. This repeated testing was conducted 1.5, 2.5, and 3.5 years after the 2016 wildfire (all data were collected prior to the COVID-19 pandemic). We hypothesised specifically that symptoms of PTSD, depression, anxiety, and alcohol/substance use would steadily improve from 2017 to 2018 to 2019.

Overview and Ethical Considerations
Information collected from students included questions on demographics, mental health, resilience, and personal exposure to and direct impacts of the wildfire. Measurement instruments were selected by the school systems, informed by the relevant literature and advice from the University of Alberta research team. Written letters were sent to parents and guardians to inform them of the survey 2 weeks prior. Parents and guardians could have their child(ren) not participate in the survey, and students also independently had the option to participate or not in the survey as explained at the start of each survey session (see details below and Appendix A: Survey Description Script in the Supplementary Material). Survey data collection was intentionally anonymous, and participants were not asked for their names nor any other identifying information. The study design was approved by the University of Alberta's Health Research Ethics Board (ethics protocol number Pro00072669 approved June 26, 2017). This paper reports on findings from the anonymous survey data collected from both school boards.
Mental health surveys were conducted by the two school boards in Fort McMurray: Fort McMurray Public Schools and Fort McMurray Catholic Schools (henceforth, "Schools"). They asked all students in Grades 7-12 to complete the surveys in November 2017 (18 months after the 2016 wildfire), November 2018 (30 months post-wildfire), and November 2019 (42 months post-wildfire); all data were collected before the COVID-19 pandemic. The survey was administered during regular class time as part of the standard curriculum to evaluate the support programs the Schools had put in place following the wildfire (see Appendix B: Mental Health Support Programs, in the Supplementary Material). The Schools determined that surveys would be done in November 2017, 2018, and 2019 as the month of November worked best given various logistical and staff capacity considerations.
Fort McMurray Public Schools and Fort McMurray Catholic Schools ("Schools") administered all aspects of survey data collection, including participant consent, in accordance with their standard procedures and policies. The Schools asked researchers from the University of Alberta for assistance in designing the survey and analysing the anonymous dataset.

Survey Questionnaires
The survey included 10 questionnaires (see Table 1 for additional details):
1. Demographics Questionnaire (Demographics, 7 questions): a custom-designed questionnaire assessing age, gender, and the student's grade and school.
2. Impact of Fire Questionnaire (IOF, 6 questions): a custom-designed questionnaire to assess the impact of the 2016 wildfire on the student.
3. Child PTSD Symptom Scale (CPSS, 19 questions): used to assess symptoms of post-traumatic stress disorder (PTSD) (53); total CPSS score ranges from 0 to 51.
4. Patient Health Questionnaire, Adolescent version (PHQ-A, 11 questions): used to assess symptoms of depression and suicidality (54,55); total PHQ-A score ranges from 0 to 27.
5. Hospital Anxiety and Depression Scale (HADS, 7 questions, anxiety-related questions only): used to assess symptoms of anxiety (56); total HADS score ranges from 0 to 21.

Survey Administration Procedure
The vast majority (>98%) of students who completed the survey did so during regular school hours. A few students citing special circumstances completed the survey from home on their own computers. Depending on their school, students either completed the survey using a desktop computer in a computer laboratory or used laptops brought to their classroom. The survey website was based on an HTML/CSS front end and a back end server written in the Clojure programming language (http://clojure.org). A survey description script was read to each class at the beginning of the survey session (reproduced in the Supplementary Material). The script explained the purpose of the survey and provided instructions for completing the survey. It also explained that the survey was anonymous (participants were not asked for their names nor date of birth) and that participation was voluntary. Students had an opportunity to ask questions before participating. The survey battery included 96 questions in total. Participation required <20 min for most students, but a small number of students took up to 50 min. Participants were able to skip questions, but the survey description script and the survey website encouraged them to answer all questions.

[Table 1 excerpt (CRAFFT): "Have you ever ridden in a CAR driven by someone (including yourself) who was 'high' or had been using alcohol or drugs?" Response options: Yes, No. Questions 5-9 were asked only if the participant answered "yes" to one or more of questions 1-3.]

Because the surveys were anonymous, individual participants could not be linked from one survey year to the next. We therefore employed a between-subject analysis (treating all survey samples as independent) rather than a repeated-measures, within-subject analysis over successive years. This between-subject approach is statistically conservative, because it is expected to carry greater overall error variance than a within-subject analysis would.

Cut-Off Scores and Probable Diagnoses
Probable diagnoses of four different psychiatric conditions were established by thresholding each participant's scores on specific scales. Threshold values for probable diagnoses were derived from the relevant literature for each scale, as described below. Specific probable diagnoses included PTSD (based on the CPSS scale), depression (from the PHQ-A), anxiety (from the HADS), and alcohol/substance use disorder (based on the CRAFFT). The term "probable diagnosis" is used here, as opposed to "clinical diagnosis," because the scores were based on self-report scales rather than psychiatric clinical interviews. The literature reports good agreement between psychiatric clinical diagnoses of PTSD, depression, anxiety, and alcohol/substance use disorder with probable diagnoses derived from widely-published threshold scores for the above four questionnaires (57,58,62-65), and we have previously used this approach (66).
Probable PTSD was determined based on a CPSS score of 15 or more (65,67). Probable depression was determined based on a PHQ-A score of 11 or more (63). Probable moderately severe depression was determined based on a PHQ-A score of 15 or more (62). Suicidal thinking was determined from responses to two questions from the PHQ-A: question 9 "Over the past 2 weeks, how often have you been bothered by any of the following problems: Thoughts that you would be better off dead, or of hurting yourself in some way?" and question 10 "Has there been a time in the past month when you have had serious thoughts about ending your life?" Participants were assessed as exhibiting suicidal thinking if they answered "Several days," "More than half the days," or "Nearly every day" to PHQ-A question 9 and "Yes" to question 10. Participants answering "Not at all" to question 9 skipped (were not shown) question 10, and they were assessed as not exhibiting suicidal thinking. In addition, participants answering "Several days," "More than half the days," or "Nearly every day" to PHQ-A question 9 and "No" to question 10 were assessed as not exhibiting suicidal thinking (as distinct from thinking about self-harm). PHQ-A question 11 was not considered in the definition of suicidal thinking. Probable anxiety was determined based on a HADS score of 11 or more (64). Probable alcohol/substance use disorder was determined based on a CRAFFT score of 2 or more (57,58). Tobacco use was determined as answering "yes" to either of the two questions on the Tobacco Use Questionnaire. Finally, an "Any of 4 probable diagnoses" criterion was defined as being positive for one or more of the four probable diagnoses: PTSD, depression, anxiety, or alcohol/substance use disorder.
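The cut-off rules above can be summarised as a short illustrative sketch. It is not the authors' code (their analysis was written in Clojure); the function and key names are hypothetical, and the handling of a missing question 10 answer (treated here as not meeting the criterion) is an assumption, since the text does not say how skipped items were scored.

```python
from typing import Dict, Optional

# Cut-off scores as stated in the text: a score at or above the cut-off
# yields a "probable diagnosis". Dictionary keys are hypothetical labels.
CUTOFFS = {"ptsd": 15, "depression": 11, "anxiety": 11, "substance_use": 2}
MODERATELY_SEVERE_DEPRESSION_CUTOFF = 15  # PHQ-A score of 15 or more

# PHQ-A question 9 answers that count as endorsing the item.
Q9_ENDORSED = ("Several days", "More than half the days", "Nearly every day")


def probable_diagnoses(cpss: int, phq_a: int, hads: int, crafft: int) -> Dict[str, bool]:
    """Threshold total scale scores into probable-diagnosis flags."""
    flags = {
        "ptsd": cpss >= CUTOFFS["ptsd"],
        "depression": phq_a >= CUTOFFS["depression"],
        "anxiety": hads >= CUTOFFS["anxiety"],
        "substance_use": crafft >= CUTOFFS["substance_use"],
    }
    # "Any of 4 probable diagnoses": positive for one or more of the four above.
    flags["any_of_4"] = any(flags.values())
    flags["moderately_severe_depression"] = phq_a >= MODERATELY_SEVERE_DEPRESSION_CUTOFF
    return flags


def suicidal_thinking(q9: str, q10: Optional[str]) -> bool:
    """Suicidal thinking requires endorsing question 9 AND answering "Yes" to question 10.

    q10 is None when the question was not shown (q9 == "Not at all").
    Assumption: a missing q10 answer is scored as not meeting the criterion.
    """
    return q9 in Q9_ENDORSED and q10 == "Yes"
```

For example, `probable_diagnoses(cpss=15, phq_a=10, hads=11, crafft=1)` flags probable PTSD and probable anxiety but not depression or alcohol/substance use disorder, so the "Any of 4" criterion is met.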
For each of the 15 dependent measures, we tested five statistical effects: (1) linear effect of time (2017 vs. 2018 vs. 2019), (2) linear effect of age (11 to 19 years old), (3) effect of female vs. male gender identity, (4) effect of other vs. female/male gender identity, and (5) effect of preferred not to say vs. female/male gender identity.
Participant gender identity was determined based on Demographics Questionnaire question 3 "What gender do you identify with?," with answer choices "female," "male," "other," and "prefer not to say." Statistical test 3 (female vs. male) compared participants answering "female" vs. those answering "male." Test 4 (other vs. female/male) compared participants answering "other" vs. those answering either "female" or "male." Test 5 (preferred not to say vs. female/male) compared participants answering "prefer not to say" vs. those answering either "female" or "male." Each test of a gender effect included only those participants with the relevant gender identities (test 3: female and male; test 4: female, male, and other; test 5: female, male, and preferred not to say).

Details of Statistical Analysis
All statistical comparisons were done using permutation testing on the slope parameter from a fitted linear model, with a null hypothesis of zero slope. The linear model included a slope parameter for the effect variable (time, age, female vs. male, other vs. female/male, preferred not to say vs. female/male) as well as parameters for "nuisance variables" as described below. Mathematical details of the linear modelling procedure are included in the "Details of Linear Modelling" section below.
Permutation testing is a non-parametric method and was chosen for its robustness against non-normality. The number of iterations was $10^5$ for all permutation tests. All tests were two-tailed. That is, to compute the p-value for a given test, the absolute value of the slope parameter fitted to the real data was compared against the absolute values of the $10^5$ simulated slope parameters fitted to permuted data.
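As an illustration of the two-tailed comparison just described, the following minimal sketch runs a permutation test on the slope of a simple regression. It is not the authors' implementation (which is in Clojure and includes nuisance variables). Several details are assumptions not stated in the text: that the dependent variable is the quantity shuffled, that the p-value is the plain proportion of permuted slopes at least as extreme as the observed one, and the function names themselves.

```python
import random
from typing import List, Sequence


def slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope of y on x for a simple regression with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x: Sequence[float], y: Sequence[float],
                        n_iter: int = 100_000, seed: int = 0) -> float:
    """Two-tailed permutation p-value for the null hypothesis of zero slope."""
    rng = random.Random(seed)          # seeded for reproducibility
    observed = abs(slope(x, y))        # |slope| fitted to the real data
    y_perm: List[float] = list(y)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(y_perm)            # break any x-y association (assumed scheme)
        if abs(slope(x, y_perm)) >= observed:
            hits += 1
    return hits / n_iter
```

A call such as `permutation_p_value(survey_year, cpss_scores)` (hypothetical variable names) would return the proportion of permuted datasets whose absolute slope is at least as large as the observed one.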
In total, our analysis of the five statistical effects for each of 15 dependent variables included 75 individual statistical tests. We addressed multiple comparisons using the Benjamini-Hochberg method for false discovery rate (FDR) correction. This method computed a threshold of p = 0.025 for FDR correction across all 75 tests.
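The Benjamini-Hochberg step-up procedure referred to above can be sketched as follows. This is an illustrative version only: the FDR level `q` is an assumed parameter (the text reports the resulting threshold of p = 0.025 but does not state the level used), and the function name is hypothetical.

```python
from typing import Sequence


def benjamini_hochberg_threshold(p_values: Sequence[float], q: float = 0.05) -> float:
    """Return the largest p-value declared significant by Benjamini-Hochberg.

    The sorted p-values p_(1) <= ... <= p_(m) are compared against q * k / m;
    the threshold is p_(k) for the largest rank k satisfying p_(k) <= q * k / m.
    Returns 0.0 when no test survives correction.
    """
    m = len(p_values)
    threshold = 0.0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= q * rank / m:
            threshold = p  # keep the largest qualifying p-value (step-up rule)
    return threshold
```

Every test whose p-value is at or below the returned threshold is then treated as significant after FDR correction.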
Distributions of gender identities and ages were similar across time, and the distribution of gender identities was similar across different ages (see "Demographics" in the Results section). Nonetheless, to address the possibility that results for one effect might have been driven in part by some small difference in one or more of the other effect variables, we included "nuisance variables" in the linear models to which permutation testing was applied. To test effects of time (2017 vs. 2018 vs. 2019), we used a linear model with a term for time as well as nuisance variables including a covariate for age and four indicator variables for gender identity: female, male, other, and preferred not to say. For analyses on the effects of time, only the fitted slope parameter for time was used to generate p-values. Including the other nuisance variables allowed the model to separate out effects of age and gender from effects of time. Similarly, analyses of age used five nuisance variables, including a time covariate and four indicator variables for gender identities. Analyses of gender effects (female vs. male, other vs. female/male, preferred not to say vs. female/male) included time and age as nuisance variables.
As expected from a sample of students in Grade 7-12, there were substantially fewer participants who were 11 or 18-19 years old, compared to those who were 12-17 years old, at the time of data collection (see "Demographics" in the Results section). To address the possibility that results for effects of age might be driven by leverage effects from smaller sample numbers in the extremes (aged 11 or 18-19 years old), we ran a follow-up analysis which included only participants aged 12-17. We performed all analyses using in-house computer code written in the Clojure programming language (http://clojure.org). The code for statistical testing and FDR correction is available at http://github.com/mbrown/mrgbstats.

Details of Linear Modelling
For a given analysis, we defined an effect variable $x_i$ and a dependent variable $y_i$, as well as $J$ covariate nuisance variables $v_{j,i}$ and $K$ indicator nuisance variables $w_{k,i}$, where $i \in [1, N]$ with $N$ being the number of surveys used in the analysis.
For analyses on effects of time or effects of age, the effect variable $x_i$ was mean-centred to make it "independent" (orthogonal) to the intercept (i.e., constant offset). For analyses on effects of gender (female vs. male, other vs. female/male, preferred not to say vs. female/male), the effect variable was categorical and therefore not mean-centred. Covariate nuisance variables $v_{j,i}$ were mean-centred, and indicator nuisance variables $w_{k,i}$ were not mean-centred.
For analyses on effects of time or effects of age, we did not include an intercept term in the model (i.e., a constant offset column containing all ones in the model matrix). Because analyses of time and age included four nuisance indicator variables for gender identities, an intercept column would have been a linear combination of those four indicator variables, rendering the model matrix degenerate. The effect variable for time or for age was mean-centred and therefore orthogonal to the constant offset, in any case. For analyses of effects of gender, which did not have the four nuisance variables for gender, we did include a constant offset term.
For each analysis, we created a model matrix $X$ from the effect variable $x_i$, $J$ covariate nuisance variables $v_{j,i}$, and $K$ indicator nuisance variables $w_{k,i}$, as well as a constant offset term for analyses of gender effects. Each survey $i$ contributed one effect variable value, one dependent variable value, and one value for each of $(J + K)$ nuisance variables. For example, for analyses on effects of time, $x_i$ was time; $v_{1,i}$ was age; $w_{1,i}$, $w_{2,i}$, $w_{3,i}$, and $w_{4,i}$ were indicator variables for gender identity, including female, male, other, and preferred not to say; and there was no constant offset column. To take a second example, for analyses of other vs. female/male gender identity, $x_i$ was 0 for participants identifying as female or male and 1 for those with other gender identity; $v_{1,i}$ and $v_{2,i}$ were time and age, respectively; there were no indicator nuisance variables; and there was a constant offset column. We created a model matrix $X$ with size $N$ by $(1 + J + K)$ or else $N$ by $(2 + J + K)$, for analyses without and with a constant offset column, respectively. The first column of $X$ consisted of the effect variable values $x_i$. If a constant offset column was included, it was the second column. The next $J$ columns were comprised of the mean-centred covariate nuisance variables $v_{j,i}$, and the last $K$ columns were comprised of the indicator nuisance variables $w_{k,i}$.
We defined the vector $\beta = (X^{T} X)^{-1} X^{T} \vec{y}$, where $\vec{y}$ was the vector of dependent variable values $y_i$. The slope parameter used for permutation testing was $\beta_1$, the first element of $\beta$, representing the fitted scaling parameter for the effect variable.
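To make the construction concrete, the sketch below builds the model matrix for the analysis of time (mean-centred time, mean-centred age, four gender indicator columns, no constant offset) and solves the normal equations for the coefficient vector. It is an illustrative pure-Python version under stated assumptions, not the authors' Clojure code: the function names, the gender labels, and the use of Gauss-Jordan elimination with a rank check are all choices made here for the example.

```python
from typing import List, Sequence

# Answer choices for gender identity, used to build the indicator columns.
GENDERS = ("female", "male", "other", "prefer not to say")


def mean_centre(values: Sequence[float]) -> List[float]:
    """Subtract the mean so the column is orthogonal to a constant offset."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def time_model_matrix(time: Sequence[float], age: Sequence[float],
                      gender: Sequence[str]) -> List[List[float]]:
    """Rows are [x_i, v_1i, w_1i, w_2i, w_3i, w_4i]; no constant offset column."""
    x = mean_centre(time)   # effect variable (first column)
    v = mean_centre(age)    # covariate nuisance variable
    return [[x[i], v[i]] + [1.0 if gender[i] == g else 0.0 for g in GENDERS]
            for i in range(len(x))]


def fit_beta(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Solve (X^T X) beta = X^T y by Gauss-Jordan elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(p)]
         + [sum(row[a] * yi for row, yi in zip(X, y))]
         for a in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("model matrix is rank-deficient (degenerate)")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]
```

The first element of the returned list corresponds to $\beta_1$, the slope for the effect variable. Appending an all-ones column to this particular matrix makes `fit_beta` raise an error, because the constant column equals the sum of the four indicator columns; this is the degeneracy that motivated omitting the intercept in the analyses of time and age.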

RESULTS
The survey was administered to all Grade 7-12 students in Fort McMurray, Alberta, Canada who were attending either the Public or Catholic schools on the days the survey was conducted during the month of November in 2017, 2018, and 2019. Five Public schools and two Catholic schools participated in the survey. In total, 9,920 surveys were collected during the period of 2017 to 2019. Forty-five percent were collected from Public schools and 55% from Catholic schools. As all surveys were anonymous, there was no way to identify which students did or did not repeat the survey over successive years.

Data Exclusion
Data from 544 surveys were excluded based on five exclusion criteria. Criteria 1 and 2 excluded participants with ambiguous age. The remaining participants had non-ambiguous ages in the range 11-19 years, allowing us to model effects of age as a linear variable. Criteria 3 to 5 excluded participants who gave inconsistent answers, possibly because they were not paying attention to the survey or did not understand the questions. After exclusions, the final dataset included 9,376 surveys.

Demographics
Demographics for the 9,376 surveys were as follows. Self-reported gender identity was 47.3% female, 48.6% male, 1.7% other, and 2.4% preferred not to say. Age ranged from 11 to 19, and the mean age of participants was 14.3 years ± 1.8 (standard deviation). For additional demographic details, see Table 2. Distributions of gender identities and ages were similar across the 3 years of data collection (Table 2). The distribution of gender identities was similar across different ages as well (Table 2).

Effects of Age (11-19 Years Old)
Scores for PTSD (CPSS), depression (PHQ-A), anxiety (HADS), and alcohol/substance use (CRAFFT) increased with age. Self-esteem (Rosenberg), quality of life (Kidscreen), and resilience (CYRM-12) scores decreased with age. Rates of probable diagnoses of PTSD, depression, moderately severe depression, suicidal thinking, anxiety, alcohol/substance use disorder, tobacco use, and the "Any of 4 probable diagnoses" category all increased with age. We did follow-up analyses of age restricted to participants aged 12-17 years old, with time and gender partialled out. Suicidal thinking did not change significantly with age among participants aged 12-17 years (p = 0.11), in contrast to the analysis of ages 11-19 years, which found a statistically significant increase in the rate of suicidal thinking with age (p = 0.0088). The other 14 dependent measures exhibited the same pattern of statistically significant changes in analyses with age 12-17 years (CYRM-12 p = 0.00038, CPSS p = 0.00002, all other tests p = 0.00001) as compared to analyses with age 11-19 years.

Effects of Gender Identity
Tables 5-7 present results of three analyses of gender identity. Gender identity was determined based on Demographics Questionnaire question 3 "What gender do you identify with?," with answer choices "female," "male," "other," and "prefer not to say." Analyses of gender identity compared female vs. male (Table 5), other vs. female/male (Table 6), and preferred not to say vs. female/male (Table 7) (see "Dependent Variables and Statistical Effects Tested" in the Materials and Methods section for additional details). Analyses of gender effects were done using permutation testing, with time and age partialled out. All 15 dependent measures showed significant differences for all three gender identity comparisons, with statistical significance surviving FDR multiple comparison correction for all tests. Scores for PTSD (CPSS), depression (PHQ-A), anxiety (HADS), and alcohol/substance use (CRAFFT) were higher in females vs. males, in those with other gender identity vs. females/males, and in those who preferred not to say vs. females/males. Self-esteem (Rosenberg), quality of life (Kidscreen), and resilience (CYRM-12) scores were lower in females, in participants with other gender identity, and in participants who preferred not to say. Rates of probable diagnoses of PTSD, depression, moderately severe depression, suicidal thinking, anxiety, alcohol/substance use disorder, tobacco use, and the "Any of 4 probable diagnoses" category were higher in females, in those with other gender identity, and in those who preferred not to say.

DISCUSSION
This study investigated the multi-year impacts of wildfires on youth mental health. As we have previously reported, the Fort McMurray student population exhibited elevated rates of probable depression, suicidal thinking, and tobacco use; elevated symptoms of anxiety; and reduced scores for quality of life and self-esteem 18 months after the 2016 wildfire, as compared to a control population that had not recently experienced a natural disaster (12). At that time, we observed similar mental health patterns for youth who were not actually present in Fort McMurray during the 2016 wildfire, although youth with greater personal exposure to impacts of the fire (e.g., home destroyed) exhibited worse symptoms of PTSD, depression, anxiety, and alcohol/substance use and lower scores for self-esteem and quality of life (13). The current study provides evidence of ongoing long-term mental health impacts on youth 3.5 years (42 months) post-wildfire.

Longer Term Mental Health Impacts
The results from the present study indicate a slow but statistically significant trend of worsening mental health from 2017 to 2019, including increased symptom scores and increased rates of probable diagnoses of PTSD, depression, anxiety, drug use, and alcohol use. Quality of life and self-esteem scores also decreased from 2017 to 2019. Tobacco use and resilience scores did not change significantly. These findings are consistent with other studies reporting long-term negative impacts on mental health from natural disasters (41-45), although these studies are not directly comparable because they focused on mental health in adulthood and/or used different outcome measures. Our results do not support our hypothesis that mental health would improve with time following the 2016 wildfire. One possibility is that 3.5 years may not be sufficient time for this population to recover from the adverse mental health effects of the wildfire, though that possibility seems unlikely in light of reports that many individuals affected by disaster do recover within 1-2 years (4). Theories of recovery from trauma identify various factors important to the recovery process including a sense of safety and stability, self- and community efficacy, hope, support from family and friends, social support, and social connectedness (4,51,52). It is possible that one or more factors important to recovery may be at issue. For example, Fort McMurray has experienced an economic downturn related to the reduction in oil prices starting in 2014, and this may negatively affect the community's sense of stability and hope. Future studies would be needed to test this suggestion.
It is noteworthy that CYRM-12 resilience measures did not change significantly with time (p = 0.27), indicating that the above-discussed changes in mental health measures may not be attributable to a change in resilience.
There was a statistically significant increase in suicidal thinking with age in the analysis of participants aged 11-19 (p = 0.0088) but not in the analysis of participants aged 12-17 (p = 0.11). The difference in results seems to be driven by the larger proportion of students aged 18 and 19 exhibiting suicidal thinking (18 years: 23%, 19 years: 28%) compared to younger students (14-19%) (see Table 4).

Age-Related Differences in Mental Health Impact
We observed worse average scores on all 15 dependent measures in older vs. younger students. Specifically, older students exhibited higher mental health symptom scores, higher rates of probable diagnoses, and lower scores for self-esteem, quality of life, and resilience. These results are consistent with previous reports of increased mental health impairment among older youth compared to younger youth post-disaster (68), as well as higher rates of mental health symptoms in older adolescents more generally (69). As has been suggested by others (68), one possible interpretation of this finding, which would bear future exploration, is that older youth are more aware of the challenges facing their families and the larger community (rebuilding, economic implications, etc.) and may have concerns regarding their own future, and that this awareness negatively influences their well-being compared to younger youth. That older students also exhibited higher mental health symptom scores and lower resilience scores is consistent with previous research showing an association between lesser resilience and worse mental health outcomes following disaster (28,35,70,71) and with theoretical conceptions of resilience and its role in buffering the individual's mental health from harm due to adverse experiences (72-74). Reduced resilience may have played a role in making older students more vulnerable to developing negative mental health symptoms.

Differences in Mental Health Related to Gender Identity
Our analyses revealed worse average scores on all 15 dependent measures in students identifying as female vs. male, in students with other gender identity vs. females/males, and in students who preferred not to say their gender identity vs. females/males. These specific groups of students exhibited higher mental health symptom scores, higher rates of probable diagnoses, and lower scores for self-esteem, quality of life, and resilience.
We interpret participants answering "other" to the question of their gender identity as belonging to a gender minority, including transgender or gender non-conforming. For participants who preferred not to answer the demographics question on gender identity, some presumably identified as female or male but did not want to say so, while others presumably identified as a gender minority, including transgender or gender non-conforming. Given that the group who preferred not to answer exhibited significantly worse mental health results than those identifying as female or male, we suspect that a majority of the group who preferred not to answer in fact belonged to a gender minority, including transgender or gender non-conforming.
The finding of worse mental health scores in females compared to males is consistent with a previous report of higher rates of mental health symptoms post-natural disaster in females vs. males (75). We are not aware of any previous studies examining the impact of natural disaster on the mental health of gender minorities, including transgender and gender non-conforming individuals. It is a benefit of our population study approach, with its large sample size, that we are able to do so for the first time. More generally, our results are consistent with previous reports indicating higher rates of mental health symptoms in gender minorities, including transgender and gender non-conforming individuals, both adults and youth (76-79). Resilience scores were lower for females vs. males and for gender minorities vs. females/males, which is consistent with previous studies showing an association between lesser resilience and worse mental health outcomes following disaster (28,35,70,71). This suggests that reduced resilience may have been a factor in specific groups' developing more negative mental health symptoms.

Implications
The destructive nature of disasters tends to attract funding from government and charities to address the immediate aftermath. Our results provide an example of worsening mental health impacts in youth during the period 1.5-3.5 years following a wildfire disaster. This worsening occurred in the context of an ongoing, whole-of-community "build back better" approach to post-wildfire recovery in the area (80), as well as multiple challenges that have faced the community since the 2016 wildfire, including a downturn in the economy. (Additional challenges include disastrous flooding following the 2020 spring ice breakup and the outbreak of the COVID-19 pandemic in 2020, though these occurred after the last survey data collection was completed in 2019.) Our results underscore the need for multi-year funding, interventions, and policies to address not only the short-term physical damage but also the long-term negative mental health effects of natural disasters.
In addition to focusing on symptomatology, there is a need to investigate factors associated with post-disaster recovery processes, as well as the efficacy of psychosocial strategies during later phases of recovery relative to early interventions, while also recognising the complex and evolving social-ecological post-disaster context. For example, given that specific groups exhibited greater negative mental health effects, namely older youth, females, and gender minorities, interventions and policies aimed at mitigating these effects may be more effective if they take into account developmental stage and gender identity.

Limitations
Our analysis included a large dataset of 9,376 survey responses. Conducting full clinical interviews with this large number of participants was not feasible, and so we used clinical measures based on self-report questionnaires, which is a limitation of the study. As noted above, in addition to the 2016 wildfire, Fort McMurray has experienced an economic downturn since 2014. The resulting job losses and financial impacts on families likely also affected youth mental health, so the trends we observed cannot be attributed to the wildfire alone. We are not aware of any study specifically on the effects of Alberta's economic downturn on youth mental health, though one report found a negative mental health impact in adults (81). Additionally, given that this study utilised anonymous data, we were not able to identify longitudinal trends in individuals, only in groups.

CONCLUSION
This study presents a cross-sectional statistical analysis of longitudinal mental health measurements in a population of Fort McMurray youth in Grades 7-12 during the period of 1.5-3.5 years following the 2016 Alberta wildfire. Findings indicate that there was a long-term trend of worsening mental health during that period. These observations support previous reports that youth and communities experience long-term mental health impacts following major natural disasters, such as wildfires. Our findings emphasise the need for multi-year funding and programs to support child and youth mental health in communities that have experienced such disasters.

AUTHOR'S NOTE
Some portions of the manuscript, in the Materials and Methods section, are based on similar material from two previous papers (12,13) presenting findings from separate analyses of the data from Fort McMurray collected in 2017 only.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Requests for data access should be made by email to the corresponding author.

ETHICS STATEMENT
This study involving human participants was reviewed and approved by the University of Alberta's Health Research Ethics Board (ethics protocol number Pro00072669, approved June 26, 2017). Data were collected as part of the Fort McMurray Public and Catholic Schools' standard curriculum to evaluate their support programs. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study, in accordance with the national legislation and the institutional requirements. Parents/guardians were notified of the study 2 weeks in advance and were given the opportunity to opt their child(ren) out of the study. Participants themselves were also given the opportunity to opt out, as was explained to them at the beginning of each survey session.