<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="brief-report">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Ecol. Evol.</journal-id>
<journal-title>Frontiers in Ecology and Evolution</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Ecol. Evol.</abbrev-journal-title>
<issn pub-type="epub">2296-701X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fevo.2022.942766</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Ecology and Evolution</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Ensemble modeling for American chestnut distribution: Locating potential restoration sites in Pennsylvania</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Henderson</surname> <given-names>Alec F.</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1813371/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Santoro</surname> <given-names>Jennifer A.</given-names></name>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1864566/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kremer</surname> <given-names>Peleg</given-names></name>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/561332/overview"/>
</contrib>
</contrib-group>
<aff><institution>Department of Geography and the Environment, Villanova University</institution>, <addr-line>Villanova, PA</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Michael Sears, Clemson University, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Andrew E. Newhouse, SUNY College of Environmental Science and Forestry, United States; Hill Craddock, University of Tennessee at Chattanooga, United States</p></fn>
<corresp id="c001">&#x002A;Correspondence: Alec F. Henderson, <email>ahende15@villanova.edu</email></corresp>
<fn fn-type="other" id="fn002"><p><sup>&#x2020;</sup>ORCID: Jennifer A. Santoro, <ext-link ext-link-type="uri" xlink:href="https://orcid.org/0000-0002-7043-7211">orcid.org/0000-0002-7043-7211</ext-link>; Peleg Kremer, <ext-link ext-link-type="uri" xlink:href="https://orcid.org/0000-0001-6844-5557">orcid.org/0000-0001-6844-5557</ext-link></p></fn>
<fn fn-type="other" id="fn004"><p>This article was submitted to Models in Ecology and Evolution, a section of the journal Frontiers in Ecology and Evolution</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>10</day>
<month>08</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>942766</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>05</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>07</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Henderson, Santoro and Kremer.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Henderson, Santoro and Kremer</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>The American chestnut (<italic>Castanea dentata</italic> Borkh.) was an economically, ecologically, and culturally important tree in eastern American hardwood forests. However, the American chestnut is currently functionally absent from these forests due to the introduction in the early 1900s of an invasive fungus (<italic>Cryphonectria parasitica</italic> (Murr.) Barr), the causal agent of chestnut blight. Field experiments are being carried out to develop a blight-resistant American chestnut tree, but range-wide restoration will require localized understanding of its current distribution and what factors contribute to suitable American chestnut habitat. While previous studies have researched species distribution of the American chestnut, it is important to understand how the choice of species distribution modeling (SDM) technique affects model results. In this paper we create an ensemble model that combines multiple SDM techniques to predict areas of suitable American chestnut habitat in Pennsylvania. Results indicate that model accuracy varied considerably by SDM technique &#x2013; with artificial neural networks performing the worst (Area-Under-the-Curve, AUC = 0.705) and gradient boosting models performing the best (AUC = 0.877). Even though SDM technique accuracy varied, most models identified the same environmental variables as the most important: ratio of sand to clay in the soil, canopy cover, topographic convergence index, and topographic position index. This study offers insight into the best SDM techniques to use, as well as a method of combining SDMs for higher prediction confidence.</p>
</abstract>
<kwd-group>
<kwd>American chestnut</kwd>
<kwd>species distribution models</kwd>
<kwd>ensemble modeling</kwd>
<kwd>suitable habitat</kwd>
<kwd>restoration</kwd>
</kwd-group>
<counts>
<fig-count count="1"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="37"/>
<page-count count="9"/>
<word-count count="6038"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>Introduction</title>
<sec id="S1.SS1">
<title>American chestnut background</title>
<p>Until the beginning of the 20th century, the American chestnut (<italic>Castanea dentata</italic> Borkh.) was a hallmark tree of eastern American hardwood forests, ranging from Ontario to Alabama and spanning from the Atlantic coast to Illinois (<xref ref-type="bibr" rid="B28">Russell, 1987</xref>; <xref ref-type="bibr" rid="B5">Collins et al., 2017</xref>). Throughout this range, <italic>C. dentata</italic> had crucial ecological, economic, and cultural importance&#x2013;providing a valuable nut crop for wildlife (<xref ref-type="bibr" rid="B6">Diamond et al., 2000</xref>), rot-resistant and durable timber for manufacturing (<xref ref-type="bibr" rid="B16">MacDonald et al., 1978</xref>), and properties enabling the ways of life of Native Americans and Appalachian communities (<xref ref-type="bibr" rid="B29">Steiner and Carlson, 2006</xref>). In 1904, an invasive chestnut blight, caused by the ascomycete fungus <italic>Cryphonectria parasitica</italic> (Murr.) Barr, was discovered on <italic>C. dentata</italic> trees (<xref ref-type="bibr" rid="B27">Rigling and Prospero, 2018</xref>). This fungus was unintentionally introduced to eastern American forests prior to 1904, probably on nursery stock from Japan (<xref ref-type="bibr" rid="B18">Milgroom et al., 1996</xref>; <xref ref-type="bibr" rid="B7">Dutech et al., 2012</xref>; <xref ref-type="bibr" rid="B27">Rigling and Prospero, 2018</xref>), and the blight spread rapidly, functionally extirpating <italic>C. dentata</italic> from the overstory in just 50 years (<xref ref-type="bibr" rid="B21">Paillet, 2002</xref>).</p>
<p>Since the loss of <italic>C. dentata</italic> from the forest overstory, considerable efforts have been made to introduce genes for blight resistance <italic>via</italic> introgression from the Asian species of <italic>Castanea</italic> into locally adapted populations of <italic>C. dentata</italic> throughout its native range. The American Chestnut Foundation (TACF) is piloting per-state chestnut backcross breeding programs to develop a hybrid American chestnut tree with <italic>C. dentata</italic> traits but blight resistance from other chestnut species, including <italic>C. crenata, C. henryi</italic>, and <italic>C. mollissima</italic>. Researchers at SUNY-ESF have independently been developing a transgenic method enhancing blight resistance in <italic>C. dentata</italic> (<xref ref-type="bibr" rid="B30">Steiner et al., 2017</xref>); recently, TACF and SUNY-ESF have combined these methods into a unified approach (<xref ref-type="bibr" rid="B37">Westbrook et al., 2019</xref>). Both methodologies seem promising, but both require finding genetic material from surviving mature <italic>C. dentata</italic> trees. Field breeding methods are important, but it is also crucial to understand <italic>C. dentata</italic> habitat preferences to determine where to plant blight-resistant trees in the future.</p>
</sec>
<sec id="S1.SS2">
<title>Species distribution models</title>
<p>Species distribution models (SDMs) are useful tools to predict probable areas of species presence as well as areas of suitable habitat for a given species and can contribute valuable knowledge on species extent across a landscape (<xref ref-type="bibr" rid="B8">Elith and Leathwick, 2009</xref>). By using SDMs over large areas, managers can efficiently isolate the most ideal locations for <italic>C. dentata</italic> habitat before using boots-on-the-ground approaches such as soil samples to choose the best sites for restoration. SDMs use layers of environmental data, species occurrence points, and species absence points to generate statistics and predictions of species distribution (<xref ref-type="bibr" rid="B14">Franklin, 2010</xref>). Environmental variables used for SDM are often layers describing the topography, land cover, climate, or soil attributes of the region that may impact suitable habitat. For example, historical accounts of <italic>C. dentata</italic> in Pennsylvania suggest that chestnut was typically found on sandy soils and ridge topography, which describe important environmental layers to include in SDMs (<xref ref-type="bibr" rid="B20">Nowacki and Abrams, 1992</xref>).</p>
<p>Species distribution models have been used to model habitat distribution for a variety of tree species in order to inform land management strategies in the face of climate change (<xref ref-type="bibr" rid="B4">Booth, 2018</xref>). <xref ref-type="bibr" rid="B17">Matthews et al. (2011)</xref> examined the responses of 134 tree species to climate change using SDMs and found that species life history characteristics played a role in range shifts. Previous research has also used SDMs to explore <italic>C. dentata</italic> distribution at various spatial scales and extents across its range. Full-range studies found that temperature, precipitation, and soil factors influenced <italic>C. dentata</italic> distribution (<xref ref-type="bibr" rid="B2">Barnes and Delborne, 2019</xref>; <xref ref-type="bibr" rid="B19">Noah et al., 2021</xref>). Finer-scale SDMs for individual states or sub-regions often identified soil and topographic variables as most influential. <xref ref-type="bibr" rid="B13">Fei et al. (2007)</xref> modeled habitat of American chestnut in Mammoth Cave National Park using ecological niche factor analysis and found that ridges and steeper slopes were strong predictors of chestnut habitat. <xref ref-type="bibr" rid="B35">Tulowiecki (2020)</xref> modeled the range of American chestnut trees in western New York using historical tree records and nine different SDM techniques [artificial neural networks (ANN), classification tree analysis (CTA), flexible discriminant analysis (FDA), generalized additive models (GAM), gradient boosting models (GBM), generalized linear models (GLM), multivariate adaptive regression splines (MARS), maximum entropy modeling (MaxEnt), and random forests (RF)], and found that soil pH and slope were important habitat predictors. 
These SDM techniques differ in whether they use statistical or machine-learning methods, whether they model linear and/or non-linear relationships between variables, and whether they can model interactions between variables. Comparing approaches can therefore enhance the reliability of results (<xref ref-type="bibr" rid="B35">Tulowiecki, 2020</xref>).</p>
<p>This study contributes to understanding of SDMs of <italic>C. dentata</italic> across the state of Pennsylvania, which is central to its former range, by identifying the strengths and weaknesses of different SDM approaches and enhancing model robustness through ensemble modeling. Results of this study can be used to find further genetic material for development of blight-resistant American chestnut trees and help identify locations for pilot restoration projects to focus their money and energy (<xref ref-type="bibr" rid="B12">Fei et al., 2012</xref>). Using an ensemble of species distribution models, following the methodology outlined in <xref ref-type="bibr" rid="B35">Tulowiecki (2020)</xref>, this study aims to better understand the distribution of <italic>C. dentata</italic> across environmental ranges in the state of Pennsylvania, identify areas most suitable for reintroduction efforts, and determine how different modeling techniques perform in modeling spatial distribution of American chestnuts in Pennsylvania.</p>
</sec>
</sec>
<sec id="S2">
<title>Methods</title>
<sec id="S2.SS1">
<title>Overview of methods</title>
<p>We used a dataset of mature <italic>C. dentata</italic> locations maintained by TACF to model species distribution across all of Pennsylvania. We applied nine different SDM techniques from the &#x201C;biomod2&#x201D; package for R statistical software through the &#x201C;ShinyBIOMOD&#x201D; GUI (<xref ref-type="bibr" rid="B34">Thuiller et al., 2009</xref>, <xref ref-type="bibr" rid="B33">2016</xref>; <xref ref-type="bibr" rid="B26">R Core Team, 2013</xref>). All environmental variables used for modeling were generated and processed using ArcMap 10.7.1 geospatial software (<xref ref-type="bibr" rid="B10">ESRI, 2011</xref>). Determining true absence points for the whole region was not feasible at the statewide scale of this study, so pseudo-absence points for <italic>C. dentata</italic> were selected randomly from the entire set of potential absences within the &#x201C;biomod2&#x201D; modeling process (<xref ref-type="bibr" rid="B24">Phillips et al., 2009</xref>). These pseudo-absence points were used as absence locations for the SDMs that required them.</p>
</sec>
<sec id="S2.SS2">
<title>Statewide chestnut data</title>
<p>Surviving <italic>C. dentata</italic> locations were acquired from TACF&#x2019;s &#x201C;dentataBase&#x201D; of known and verified mature American chestnut tree locations (<xref ref-type="bibr" rid="B32">TACF, 2020</xref>). Most tree locations in this database have been collected by volunteers who are asked to send samples to TACF so that they can be verified by experts. These samples contain freshly cut twigs with mature leaves attached and the location of the tree. Presence points collected in this manner can contain uncertainty and sampling bias due to GPS inaccuracy and the greater likelihood of finding and marking trees near roads or trails where humans have access, but these points represent the most comprehensive dataset of verified surviving <italic>C. dentata</italic> locations and are thus the most useful input to SDMs despite their limitations. We filtered this database so that it would only include <italic>C. dentata</italic> records within Pennsylvania. Filtering the database resulted in 295 non-hybrid <italic>C. dentata</italic> points in Pennsylvania.</p>
</sec>
<sec id="S2.SS3">
<title>Generation of environmental variables</title>
<p>Ten environmental variables representing land cover, topography, and soil attributes were included in the species distribution models to predict <italic>C. dentata</italic> habitat suitability. Variables were identified to represent a range of habitat characteristics that could describe growing conditions for <italic>C. dentata</italic> based on prior knowledge of the species&#x2019; biology (<xref ref-type="bibr" rid="B21">Paillet, 2002</xref>; <xref ref-type="bibr" rid="B5">Collins et al., 2017</xref>) and other modeling research for chestnuts (<xref ref-type="bibr" rid="B13">Fei et al., 2007</xref>; <xref ref-type="bibr" rid="B35">Tulowiecki, 2020</xref>). These variables included canopy cover acquired from the National Land Cover Database (NLCD) (<xref ref-type="bibr" rid="B15">Homer et al., 2012</xref>), soil composition collected from ISRIC soil grids (<xref ref-type="bibr" rid="B25">Poggio et al., 2021</xref>), a Euclidean distance to streams layer generated from an Environmental Resources Research Institute streams shapefile (<xref ref-type="bibr" rid="B9">Environmental Resources Research Institute, 1998</xref>), and seven digital elevation model-derived variables that represent a range of topographic and moisture conditions. The digital elevation model was acquired from the <xref ref-type="bibr" rid="B36">U.S. Geological Survey (2000)</xref>. The digital elevation model-derived aspect layer was transformed using a Beers transformation to change a cyclic variable to a linear one for more accurate consideration in modeling (<xref ref-type="bibr" rid="B3">Beers et al., 1966</xref>). Other variables, such as soil pH, that have been considered important for chestnut habitat in prior studies were unavailable at the spatial scale and extent for this study and thus could not be included in our models. We acknowledge that our results may be affected by the exclusion of such layers. 
All variables were resampled to a resolution of 238 m to match the resolution of the soil data. <xref ref-type="table" rid="T1">Table 1</xref> shows all environmental variables used in the models, their value ranges, and their initial resolutions.</p>
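<p>As an illustration of the aspect transformation described above, the following minimal Python sketch applies the Beers formula, cos(A_max - A) + 1, to aspect values in degrees. The 45 degree (northeast) optimum and the function name beers_aspect are assumptions made for illustration only; the optimum aspect actually used is not stated in the text.</p>
<preformat><![CDATA[
```python
import math

def beers_aspect(aspect_deg, max_deg=45.0):
    """Return cos(max_deg - aspect) + 1, which lies in [0, 2].

    max_deg is the aspect that receives the highest score; 45 degrees
    (northeast) is an assumed default, not a value taken from the study.
    """
    return math.cos(math.radians(max_deg - aspect_deg)) + 1.0

# Aspect is cyclic (0 and 360 degrees both mean north); the transform
# removes the artificial jump between those two values.
for aspect in (0, 45, 135, 225, 315, 360):
    print(aspect, round(beers_aspect(aspect), 3))
```
]]></preformat>
<p>Under this assumption, northeast-facing cells score 2 and southwest-facing cells score 0, which matches the 0 to 2 range reported for the transformed aspect layer.</p>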
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Environmental variables used in all species distribution models, the range of values present for each variable, and the initial resolution of the layer.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left"></td>
<td valign="top" align="center">Environmental variable</td>
<td valign="top" align="center">Variable range within study area</td>
<td valign="top" align="center">Variable code</td>
<td valign="top" align="center">Initial resolution (m)</td>
<td valign="top" align="center">Source</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Land cover</td>
<td valign="top" align="center">Canopy cover</td>
<td valign="top" align="center">0&#x2013;99%</td>
<td valign="top" align="center">pacanopy</td>
<td valign="top" align="center">30 m</td>
<td valign="top" align="center">NLCD</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Distance to streams</td>
<td valign="top" align="center">0&#x2013;3431.82 m</td>
<td valign="top" align="center">padts</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td valign="top" align="left">Topography</td>
<td valign="top" align="center">Elevation</td>
<td valign="top" align="center">1&#x2013;890 m</td>
<td valign="top" align="center">padem</td>
<td valign="top" align="center">30 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Slope</td>
<td valign="top" align="center">0&#x2013;31.27&#x00B0;</td>
<td valign="top" align="center">paslope</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Aspect</td>
<td valign="top" align="center">0&#x2013;2 (Beers transformation of 360&#x00B0;)</td>
<td valign="top" align="center">paaspect</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Curvature</td>
<td valign="top" align="center">&#x2212;0.51 (upwardly convex) &#x2013; 0.53 (upwardly concave)</td>
<td valign="top" align="center">pacurvature</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Topographic convergence index</td>
<td valign="top" align="center">2.69 (low water accumulation) &#x2013; 24.97 (high water accumulation)</td>
<td valign="top" align="center">patci</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Topographic position index</td>
<td valign="top" align="center">&#x2212;129 (valleys) &#x2013; 149 (peaks)</td>
<td valign="top" align="center">patpi</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Topographic relative moisture index</td>
<td valign="top" align="center">14 (lower soil moisture) &#x2013; 59 (higher soil moisture)</td>
<td valign="top" align="center">patrmi</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">PASDA</td>
</tr>
<tr>
<td valign="top" align="left">Soil</td>
<td valign="top" align="center">Sand to clay ratio</td>
<td valign="top" align="center">0.64&#x2013;6.11</td>
<td valign="top" align="center">pasandclayratio</td>
<td valign="top" align="center">238 m</td>
<td valign="top" align="center">ISRIC</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn><p>&#x201C;Variable code&#x201D; reports the abbreviation used for the variable in the species distribution model.</p></fn>
</table-wrap-foot>
</table-wrap>
<p>In preparation for SDM, environmental raster data layers were processed in ArcMap 10.7.1. All rasters were reclassified to have the same cell snapping, mask, and cell size (approximately 238 m &#x00D7; 238 m). We generated Elevation, Slope, Aspect, Curvature, Topographic convergence index, Topographic position index, and Topographic relative moisture index (<xref ref-type="bibr" rid="B22">Parker, 1982</xref>) layers from the Pennsylvania digital elevation model acquired from PASDA (<xref ref-type="bibr" rid="B23">Pennsylvania Spatial Data Access, 2022</xref>). We generated a Distance to streams layer by using the ArcMap Euclidean Distance tool on the streams shapefile acquired from PASDA. The species location points were generated from the TACF dentataBase and we recalculated their coordinates to match the coordinate system of the environmental layers.</p>
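<p>To make the Distance to streams layer concrete, the sketch below computes, for every raster cell, the straight-line distance to the nearest stream cell. It is a brute-force stand-in for a GIS Euclidean distance tool, not the ArcMap implementation; the function name distance_to_streams and the toy stream mask are hypothetical, and only the 238 m cell size is taken from the text.</p>
<preformat><![CDATA[
```python
import math

def distance_to_streams(stream_mask, cell_size):
    """Distance (in map units) from each cell center to the nearest stream cell center.

    stream_mask is a list of rows; truthy values mark cells crossed by a stream.
    """
    stream_cells = [
        (r, c)
        for r, row in enumerate(stream_mask)
        for c, value in enumerate(row)
        if value
    ]
    distances = []
    for r, row in enumerate(stream_mask):
        distances.append([
            cell_size * min(math.hypot(r - sr, c - sc) for sr, sc in stream_cells)
            for c in range(len(row))
        ])
    return distances

# Toy 3 x 4 raster with a short stream segment in the middle row.
mask = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
dist = distance_to_streams(mask, cell_size=238.0)
for row in dist:
    print([round(d, 1) for d in row])
```
]]></preformat>
<p>The brute-force search is adequate for a toy grid; a statewide raster would call for a dedicated distance-transform routine.</p>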
</sec>
<sec id="S2.SS4">
<title>Species distribution modeling</title>
<p>We used ShinyBIOMOD, a graphical interface for the R package &#x201C;biomod2,&#x201D; to streamline the SDM process. We defined a geographic region of the state of Pennsylvania and uploaded the species occurrence data and environmental data we previously generated to train and apply the models. As our dataset only contained occurrence records, we generated three pseudo-absence datasets of 290 randomly generated pseudo-absence points in order to run the SDMs. We ran nine different SDMs (summarized in <xref ref-type="bibr" rid="B14">Franklin, 2010</xref>) to model <italic>C. dentata</italic> distribution. These models were ANN, CTA, FDA, GAM, GBM, GLM, MARS, MaxEnt, and RF.</p>
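<p>The random pseudo-absence selection described above can be sketched as follows. This is an illustrative stand-in rather than the biomod2 procedure: the function name sample_pseudo_absences, the toy grid, and the choice to exclude cells that already hold a presence record are all assumptions. Only the counts (295 presences, three sets of 290 pseudo-absences) come from the text.</p>
<preformat><![CDATA[
```python
import random

def sample_pseudo_absences(candidate_cells, presence_cells, n_points, n_sets, seed=0):
    """Draw n_sets independent random samples of n_points cells with no presence record."""
    rng = random.Random(seed)
    presence = set(presence_cells)
    # Assumption: a cell holding a presence record cannot serve as a pseudo-absence.
    background = [cell for cell in candidate_cells if cell not in presence]
    return [rng.sample(background, n_points) for _ in range(n_sets)]

# Toy grid of (row, col) cells standing in for the study-area raster.
grid = [(r, c) for r in range(100) for c in range(100)]
presences = random.Random(1).sample(grid, 295)
pa_sets = sample_pseudo_absences(grid, presences, n_points=290, n_sets=3)
print([len(s) for s in pa_sets])
```
]]></preformat>
<p>Drawing several independent sets lets the variation between pseudo-absence draws be averaged out across model replications.</p>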
<p>Presence and pseudo-absence points were randomly split into 80% training data and 20% validation data. We ran five replications of all models with each of the three pseudo-absence datasets (totaling 15 replications of each model) to determine <italic>C. dentata</italic> species distribution and model statistics. Each model replication recorded True Skill Statistic (TSS) and Area-Under-the-Curve (AUC) evaluation metrics, which are commonly used to evaluate model performance (<xref ref-type="bibr" rid="B31">Swets, 1988</xref>; <xref ref-type="bibr" rid="B1">Allouche et al., 2006</xref>; <xref ref-type="bibr" rid="B11">Fawcett, 2006</xref>). Finally, we built ensemble models of all generated SDMs to evaluate <italic>C. dentata</italic> distribution as informed by all nine modeling approaches. We generated three different predictions of <italic>C. dentata</italic> distribution using ensemble modeling outputs. These outputs were predicted probability of presence, binary predictions of presence/absence, and predicted probability of presence by committee agreement of binary individual model outputs. Binary predictions were made by applying a threshold to the predicted probabilities, chosen to maximize the true positive rate (sensitivity) and true negative rate (specificity) across the entire range. Pixels with probability values below the threshold were categorized as absences and pixels with probability values above the threshold were categorized as presences.</p>
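<p>The thresholding and committee steps above can be illustrated with a small Python sketch. It assumes that the threshold is chosen to maximize sensitivity + specificity - 1 (the TSS), that a pixel exactly at the threshold counts as a presence, and that committee agreement is the fraction of models voting for presence. The function names and toy values are hypothetical and are not taken from the software used in this study.</p>
<preformat><![CDATA[
```python
def sensitivity_specificity(probs, labels, threshold):
    """Sensitivity and specificity when probs >= threshold is called a presence."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def best_threshold(probs, labels):
    """Return (threshold, TSS) for the cutoff maximizing sensitivity + specificity - 1."""
    best_t, best_tss = None, -2.0
    for t in sorted(set(probs)):
        sens, spec = sensitivity_specificity(probs, labels, t)
        tss = sens + spec - 1.0
        if tss > best_tss:  # ties keep the lowest threshold
            best_t, best_tss = t, tss
    return best_t, best_tss

def committee_average(binary_maps):
    """Per-pixel fraction of models whose binary map predicts presence."""
    n_models = len(binary_maps)
    return [sum(votes) / n_models for votes in zip(*binary_maps)]

# Toy validation data: 1 = presence, 0 = pseudo-absence.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
threshold, tss = best_threshold(probs, labels)
print(threshold, tss)

# Three models, three pixels: each inner list is one model's binary map.
agreement = committee_average([[1, 0, 1], [1, 1, 0], [1, 0, 0]])
print(agreement)
```
]]></preformat>
<p>A committee value of 1 means every model predicted presence at that pixel, so high values flag locations where the individual techniques agree.</p>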
</sec>
</sec>
<sec id="S3" sec-type="results">
<title>Results</title>
<sec id="S3.SS1">
<title>Comparison of model performance</title>
<p>Accuracy statistics for each species distribution model are summarized in <xref ref-type="table" rid="T2">Table 2</xref>. These values represent the mean values for AUC (area under the receiver operating characteristics curve) and TSS (True Skill Statistic) across all model replications. AUC values range from 0 to 1 and represent the probability that the predicted suitability value at a random presence point is higher than the predicted suitability value at a random absence point (<xref ref-type="bibr" rid="B11">Fawcett, 2006</xref>). Generally, AUC values between 0.6 and 0.7 are interpreted as &#x201C;fair&#x201D; models and AUC values between 0.7 and 0.8 are interpreted as &#x201C;good&#x201D; models (<xref ref-type="bibr" rid="B31">Swets, 1988</xref>; <xref ref-type="bibr" rid="B11">Fawcett, 2006</xref>). TSS ranges from &#x2212;1 to 1, with 0 representing a model that performs no better than random guesses (<xref ref-type="bibr" rid="B1">Allouche et al., 2006</xref>). Aside from the random forest models, which showed AUC and TSS values of 1.000, the gradient boosting model and classification tree analysis had the highest AUC and TSS values. AUC and TSS values of 1.000 indicate perfect agreement or fit between errors of sensitivity and specificity (<xref ref-type="bibr" rid="B1">Allouche et al., 2006</xref>) and we suspect model overfitting occurred in the random forest model, skewing its results. Artificial neural networks and generalized linear models had the lowest AUC values, while flexible discriminant analysis and generalized linear models had the lowest TSS values.</p>
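<p>The probabilistic reading of AUC given above can be checked directly by comparing every presence score with every absence score. The sketch below is a minimal illustration with made-up suitability values; the function name auc_pairwise is hypothetical, and counting ties as half a win is a common convention assumed here.</p>
<preformat><![CDATA[
```python
def auc_pairwise(presence_scores, absence_scores):
    """Fraction of presence/absence pairs in which the presence scores higher.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

# Made-up predicted suitability at presence and pseudo-absence points.
presence = [0.9, 0.8, 0.6, 0.3]
absence = [0.7, 0.4, 0.2, 0.1]
auc = auc_pairwise(presence, absence)
print(auc)
```
]]></preformat>
<p>An AUC of 1 arises only when every presence outranks every absence, which is why a perfect score on the evaluation data is a reason to check for overfitting.</p>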
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Species distribution model performance based on each model&#x2019;s validation data against the test data.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Technique</td>
<td valign="top" align="center">AUC<xref ref-type="table-fn" rid="t2fns1">&#x002A;</xref></td>
<td valign="top" align="center">TSS<xref ref-type="table-fn" rid="t2fns1">&#x002A;</xref></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ANN</td>
<td valign="top" align="center">0.705</td>
<td valign="top" align="center">0.367</td>
</tr>
<tr>
<td valign="top" align="left">CTA</td>
<td valign="top" align="center">0.836</td>
<td valign="top" align="center">0.592</td>
</tr>
<tr>
<td valign="top" align="left">FDA</td>
<td valign="top" align="center">0.731</td>
<td valign="top" align="center">0.343</td>
</tr>
<tr>
<td valign="top" align="left">GAM</td>
<td valign="top" align="center">0.729</td>
<td valign="top" align="center">0.367</td>
</tr>
<tr>
<td valign="top" align="left">GBM</td>
<td valign="top" align="center">0.877</td>
<td valign="top" align="center">0.616</td>
</tr>
<tr>
<td valign="top" align="left">GLM</td>
<td valign="top" align="center">0.719</td>
<td valign="top" align="center">0.351</td>
</tr>
<tr>
<td valign="top" align="left">MARS</td>
<td valign="top" align="center">0.730</td>
<td valign="top" align="center">0.371</td>
</tr>
<tr>
<td valign="top" align="left">MaxEnt</td>
<td valign="top" align="center">0.752</td>
<td valign="top" align="center">0.405</td>
</tr>
<tr>
<td valign="top" align="left">RF</td>
<td valign="top" align="center">1.000</td>
<td valign="top" align="center">1.000</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="t2fns1"><p>&#x002A;Area under the curve (AUC) and true skill statistic (TSS) represent mean values over all model replications. Higher AUC values indicate better model performance and TSS values greater than zero indicate model performance better than random guesses. SDM technique abbreviations are listed in the section &#x201C;Species distribution modeling.&#x201D;</p></fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S3.SS2">
<title>Environmental variable importance</title>
<p>We generated variable importance measurements for all environmental variables in each SDM technique to identify the most important predictors for <italic>C. dentata</italic> habitat (<xref ref-type="table" rid="T3">Table 3</xref>). Sand to clay ratio of the soil was the most frequent top predictor of <italic>C. dentata</italic> distribution, identified in eight out of nine models. These models identified a positive relationship between sand to clay ratio of the soil and probability of <italic>C. dentata</italic> presence (indicating higher probability of American chestnut presence in sandier soils). Canopy cover (identified as important in five of nine models) showed a positive relationship with probability of <italic>C. dentata</italic> presence (indicating higher probability of American chestnut presence in areas of denser canopies). Other important variables included topographic convergence index (three models with a negative relationship) and topographic position index (three models with a positive relationship). The negative relationship with topographic convergence index indicates lower probability of American chestnut presence in areas of higher water accumulation, while the positive relationship with topographic position index indicates a higher probability of American chestnut presence along peaks and ridgeline formations. The artificial neural network model differed the most from other models in identification of variable importance, indicating canopy cover, distance to streams, and elevation as the three most important variables for modeling <italic>C. dentata</italic> distribution.</p>
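<p>One common way to obtain a permutation-based importance score is to shuffle a single predictor, re-run the fitted model, and report one minus the Pearson correlation between the original and the shuffled predictions. The sketch below illustrates that idea with a toy linear model. Whether this matches the exact calculation behind the reported values is an assumption, and all names and coefficients are hypothetical.</p>
<preformat><![CDATA[
```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def permutation_importance(predict, rows, var_index, seed=0):
    """1 - correlation between predictions before and after shuffling one variable."""
    rng = random.Random(seed)
    reference = [predict(row) for row in rows]
    column = [row[var_index] for row in rows]
    rng.shuffle(column)
    permuted_rows = [
        row[:var_index] + (value,) + row[var_index + 1:]
        for row, value in zip(rows, column)
    ]
    permuted = [predict(row) for row in permuted_rows]
    return 1.0 - pearson(reference, permuted)

def toy_model(row):
    """Hypothetical fitted model: variable 0 matters far more than variable 1."""
    return 0.8 * row[0] + 0.1 * row[1]

data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
importance = [permutation_importance(toy_model, rows, i) for i in range(2)]
print([round(v, 3) for v in importance])
```
]]></preformat>
<p>Scores near 0 mean that shuffling the variable barely changes the predictions, whereas scores near 1 mean the predictions depend heavily on it.</p>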
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>Environmental variables used for SDMs and their median permutation importance&#x002A; for each SDM technique over five replications.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left"></td>
<td valign="top" align="center" colspan="3">Modeling technique</td>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
</tr>
<tr>
<td/>
<td valign="top" align="center" colspan="9"><hr/></td>
</tr>
<tr>
<td valign="top" align="left">Predictor</td>
<td valign="top" align="center">ANN</td>
<td valign="top" align="center">CTA</td>
<td valign="top" align="center">FDA</td>
<td valign="top" align="center">GAM</td>
<td valign="top" align="center">GBM</td>
<td valign="top" align="center">GLM</td>
<td valign="top" align="center">MARS</td>
<td valign="top" align="center">MaxEnt</td>
<td valign="top" align="center">RF</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Canopy cover</td>
<td valign="top" align="center" style="background-color: #199747;">0.408</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.073</td>
<td valign="top" align="center" style="background-color: #199747;">0.040</td>
<td valign="top" align="center" style="background-color: #199747;">0.083</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center" style="background-color: #199747;">0.146</td>
<td valign="top" align="center" style="background-color: #199747;">0.076</td>
</tr>
<tr>
<td valign="top" align="left">Distance to streams</td>
<td valign="top" align="center" style="background-color: #ff0000;">0.500</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.042</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.069</td>
<td valign="top" align="center">0.029</td>
</tr>
<tr>
<td valign="top" align="left">Elevation</td>
<td valign="top" align="center" style="background-color: #199747;">0.462</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.036</td>
<td valign="top" align="center" style="background-color: #199747;">0.046</td>
</tr>
<tr>
<td valign="top" align="left">Slope</td>
<td valign="top" align="center">0.072</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center" style="background-color: #199747;">0.084</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.028</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.048</td>
<td valign="top" align="center">0.036</td>
</tr>
<tr>
<td valign="top" align="left">Aspect</td>
<td valign="top" align="center">0.005</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.016</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.030</td>
<td valign="top" align="center">0.025</td>
</tr>
<tr>
<td valign="top" align="left">Curvature</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.020</td>
<td valign="top" align="center">0.061</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.046</td>
<td valign="top" align="center">0.028</td>
</tr>
<tr>
<td valign="top" align="left">Topographic convergence index</td>
<td valign="top" align="center">0.085</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.058</td>
<td valign="top" align="center" style="background-color: #ff0000;">0.031</td>
<td valign="top" align="center" style="background-color: #ff0000;">0.099</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center" style="background-color: #ff0000;">0.116</td>
<td valign="top" align="center">0.044</td>
</tr>
<tr>
<td valign="top" align="left">Topographic position index</td>
<td valign="top" align="center">0.271</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center" style="background-color: #199747;">0.134</td>
<td valign="top" align="center" style="background-color: #199747;">0.140</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center" style="background-color: #199747;">0.106</td>
<td valign="top" align="center">0.092</td>
<td valign="top" align="center">0.035</td>
</tr>
<tr>
<td valign="top" align="left">Topographic relative moisture index</td>
<td valign="top" align="center">0.093</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center" style="background-color: #199747;">0.136</td>
<td valign="top" align="center">0.009</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.076</td>
<td valign="top" align="center">0.024</td>
</tr>
<tr>
<td valign="top" align="left">Sand to clay ratio</td>
<td valign="top" align="center">0.053</td>
<td valign="top" align="center" style="background-color: #199747;">0.865</td>
<td valign="top" align="center" style="background-color: #199747;">0.567</td>
<td valign="top" align="center" style="background-color: #199747;">0.440</td>
<td valign="top" align="center" style="background-color: #199747;">0.582</td>
<td valign="top" align="center" style="background-color: #199747;">0.567</td>
<td valign="top" align="center" style="background-color: #199747;">0.595</td>
<td valign="top" align="center" style="background-color: #199747;">0.429</td>
<td valign="top" align="center" style="background-color: #199747;">0.293</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="t3fns1"><p>&#x002A;Permutation importance values range from 0.000 (least important or not selected by the model) to 1.000 (most important). The three most important variables for each model are highlighted and colored to display the nature of their relationship with probability of <italic>Castanea dentata</italic> presence (Green = Positive relationship, Red = Negative relationship). SDM technique abbreviations are listed in section &#x201C;species distribution modeling.&#x201D;</p></fn>
</table-wrap-foot>
</table-wrap>
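<p>As an illustration of how a permutation importance score of the kind reported in <xref ref-type="table" rid="T3">Table 3</xref> can be obtained, the following Python sketch shuffles one predictor and measures how much the model predictions change. It assumes the score is one minus the Pearson correlation between the original and permuted predictions, clamped to the 0 to 1 range. This formulation, the function names, and the toy model are assumptions made for illustration and are not taken from the workflow used in this study.</p>
<preformat>
```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat an undefined correlation as zero
    return cov / (var_x * var_y) ** 0.5


def permutation_importance(predict, rows, var_index, seed=0):
    """Hypothetical helper: 1 - cor(original, permuted predictions)."""
    rng = random.Random(seed)
    baseline = [predict(row) for row in rows]
    # Shuffle only the column of the variable being tested.
    column = [row[var_index] for row in rows]
    rng.shuffle(column)
    permuted_rows = [
        row[:var_index] + (value,) + row[var_index + 1:]
        for row, value in zip(rows, column)
    ]
    permuted = [predict(row) for row in permuted_rows]
    if permuted == baseline:
        return 0.0  # predictions unchanged, so the variable is not used
    score = 1.0 - pearson(baseline, permuted)
    return min(1.0, max(0.0, score))


# Toy data: each row is (sand_to_clay_ratio, aspect); values are made up.
toy_rows = [(float(i), float((3 * i) % 10)) for i in range(10)]


def toy_model(row):
    """Toy model that depends only on the first variable."""
    return row[0] / 10.0


importance_sand = permutation_importance(toy_model, toy_rows, 0)
importance_aspect = permutation_importance(toy_model, toy_rows, 1)
```
</preformat>
<p>In this toy case the second variable is never used by the model, so its score is zero, which mirrors the 0.000 entries in the table for variables a model did not select.</p>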
</sec>
<sec id="S3.SS3">
<title>Ensemble model predictions</title>
<p>We generated three maps representing predicted <italic>C. dentata</italic> species distribution across Pennsylvania through ensemble modeling of all SDMs. The first map (<xref ref-type="fig" rid="F1">Figure 1A</xref>) displays the mean of SDM predictions with the influence of each SDM being weighted by the TSS of that model. This means that all models contributed to determining the suitability of each pixel, but models determined to be more accurate had more influence than models determined to be less accurate. This figure shows higher values of distribution probability in areas such as Pike County in the northeast of the state and along the Spine of Appalachia throughout the center of the state. Areas of high <italic>C. dentata</italic> suitability can also be seen in Allegheny National Forest and along Lake Erie in the northwest of the state.</p>
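<p>The TSS weighting described above can be sketched for a single pixel as follows. This is a minimal Python illustration that assumes each weight is simply the TSS of the model divided by the sum of all TSS values. The function name and the pixel suitability values are hypothetical; the two TSS values are those reported in this article for the gradient boosting and ANN models.</p>
<preformat>
```python
def tss_weighted_mean(suitability, tss):
    """Mean of per-model suitability values, weighted by each model's TSS."""
    total_weight = sum(tss[name] for name in suitability)
    if total_weight == 0:
        raise ValueError("TSS weights sum to zero")
    weighted_sum = sum(suitability[name] * tss[name] for name in suitability)
    return weighted_sum / total_weight


# TSS values as reported for GBM and ANN; pixel values are hypothetical.
tss = {"GBM": 0.616, "ANN": 0.367}
pixel = {"GBM": 0.80, "ANN": 0.20}

weighted = tss_weighted_mean(pixel, tss)
simple = sum(pixel.values()) / len(pixel)
```
</preformat>
<p>With these inputs the weighted value (about 0.58) lies closer to the prediction of the more accurate model than the unweighted mean of 0.50 does.</p>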
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Three panel map showing mean SDM predictions of <italic>Castanea dentata</italic> distribution weighted by TSS scores of each individual model <bold>(A)</bold>, modeled binary presence/absence of <italic>Castanea dentata</italic> in Pennsylvania, with a threshold to maximize sensitivity and specificity of the model <bold>(B)</bold>, and SDM techniques committee agreement of species distribution of <italic>Castanea dentata</italic> <bold>(C)</bold>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fevo-10-942766-g001.tif"/>
</fig>
<p><xref ref-type="fig" rid="F1">Figure 1B</xref> represents a model of binary presence/absence of <italic>C. dentata</italic> in Pennsylvania based on the combined ensemble model of SDMs. This model defined a suitability threshold in order to maximize the sensitivity and specificity of the prediction. All pixels with suitability values lower than the optimal threshold were defined as absences and all pixels with suitability values higher than the optimal threshold were defined as presences. We determined the optimal threshold to be a value of 0.58, which &#x2013; when tested with the presence and pseudo-absence points &#x2013; resulted in 74 true positives, 786 true negatives, 75 false positives, and 23 false negatives. According to this binary presence/absence, approximately 16,088 square kilometers of Pennsylvania are considered suitable for <italic>C. dentata</italic> occupancy.</p>
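<p>The confusion matrix counts above can be converted into sensitivity, specificity, and TSS with a few lines of Python. The sketch uses the standard definitions (TSS is sensitivity plus specificity minus one; <xref ref-type="bibr" rid="B1">Allouche et al., 2006</xref>). The function name is hypothetical, and the derived values are computed here from the reported counts rather than quoted from the study outputs.</p>
<preformat>
```python
def binary_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and TSS from confusion matrix counts."""
    sensitivity = tp / (tp + fn)  # share of presences predicted as presence
    specificity = tn / (tn + fp)  # share of absences predicted as absence
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": sensitivity + specificity - 1.0,
    }


# Counts reported for the 0.58 ensemble threshold.
metrics = binary_metrics(tp=74, tn=786, fp=75, fn=23)
```
</preformat>
<p>For these counts the sketch gives a sensitivity of about 0.76, a specificity of about 0.91, and a TSS of about 0.68.</p>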
<p>Finally, we generated a model of predicted spatial distribution of <italic>C. dentata</italic> in Pennsylvania determined by committee agreement of individual models (<xref ref-type="fig" rid="F1">Figure 1C</xref>). This model was generated by first creating binary presence/absence models for each individual SDM technique, applying a threshold to maximize sensitivity and specificity. Each pixel of the final model is then classified based on how many of the individual SDM techniques classify it as a presence or absence. If 0 models classify it as presence, there is a high consensus of absence. If 1&#x2013;3 models classify it as presence, there is low consensus of absence. If 4&#x2013;6 models classify it as presence, there is a low consensus of presence. If 7&#x2013;9 models classify it as presence, there is high consensus of presence.</p>
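<p>The committee agreement rule described above can be written as a short Python sketch. The vote ranges are those given in the text for nine models. The function names are hypothetical, and counting a suitability value exactly equal to the threshold as an absence is an assumption, since the text does not say how ties are handled.</p>
<preformat>
```python
def committee_votes(suitabilities, thresholds):
    """Count how many models place a pixel above their own threshold."""
    # Assumption: a value exactly equal to the threshold counts as absence.
    return sum(1 for s, t in zip(suitabilities, thresholds) if s > t)


def consensus_class(presence_votes):
    """Map the number of presence votes (0 to 9) to a consensus class."""
    if presence_votes not in range(10):
        raise ValueError("presence_votes must be an integer from 0 to 9")
    if presence_votes >= 7:
        return "high consensus of presence"
    if presence_votes >= 4:
        return "low consensus of presence"
    if presence_votes >= 1:
        return "low consensus of absence"
    return "high consensus of absence"


# Hypothetical pixel: nine model suitabilities and per-model thresholds.
pixel_values = [0.9, 0.7, 0.6, 0.2, 0.8, 0.1, 0.3, 0.65, 0.4]
model_thresholds = [0.5] * 9
pixel_class = consensus_class(committee_votes(pixel_values, model_thresholds))
```
</preformat>
<p>In this hypothetical pixel five of the nine models vote for presence, so the pixel falls in the low consensus of presence class.</p>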
</sec>
</sec>
<sec id="S4" sec-type="discussion">
<title>Discussion</title>
<p>This study provides insight on the probable spatial distribution of <italic>C. dentata</italic> across Pennsylvania based on various habitat variables and the effect of different SDM techniques on modeling American chestnut habitat. Most models generally agree with other modeling studies indicating <italic>C. dentata</italic> suitability is best in high elevation ridgelines with sandy soils and low moisture (<xref ref-type="bibr" rid="B13">Fei et al., 2007</xref>; <xref ref-type="bibr" rid="B35">Tulowiecki, 2020</xref>). Eight out of the nine models we ran identified sand to clay ratio of the soil as the most important environmental variable, and furthermore indicated a positive relationship between probability of <italic>C. dentata</italic> presence and sand to clay ratio. Based on this result, we can confidently say that our models are in line with prior studies of <italic>C. dentata</italic> habitat modeling in addition to knowledge of American chestnut biology. Future modeling studies for <italic>C. dentata</italic> may consider including additional soil datasets to study this relationship further.</p>
<p>Five out of the nine models identified canopy cover as one of the three most important environmental variables, and all identified a positive correlation between canopy coverage and probability of <italic>C. dentata</italic> presence. This finding suggests that American chestnut is more abundant in areas with a more developed overstory. This may be due to the ability of <italic>C. dentata</italic> to be competitive in lower light environments as it has a history as a generalist species able to thrive in a variety of forested conditions. This result may also reflect the current post-blight niche that smaller diameter surviving <italic>C. dentata</italic> trees occupy in eastern forests.</p>
<p>Three out of the nine models identified topographic position index as one of the three most important environmental variables and three out of the nine models identified topographic convergence index as one of the three most important variables. These models identified a positive and a negative relationship, respectively, between these metrics and <italic>C. dentata</italic> habitat suitability. These findings are supported by the literature and in line with chestnut biology, which indicates that American chestnut is most frequently found on ridgeline and peak topography and in areas of lower water accumulation.</p>
<p>Accuracy statistics such as AUC and TSS allow for model performance evaluation; the high model accuracy values obtained in this study add confidence to our habitat predictions. Most models had AUC values in the 0.7 to 0.8 range, indicating that they performed very well in accurately predicting <italic>C. dentata</italic> species distribution (<xref ref-type="table" rid="T2">Table 2</xref>). The artificial neural networks (ANN) model performed the worst out of the nine models with an AUC value of 0.705 and a TSS value of 0.367, suggesting that it is less informative for evaluating <italic>C. dentata</italic> distribution. Aside from the random forest model (AUC and TSS of 1.000), the best performing model was the gradient boosting model with an AUC value of 0.877 and a TSS value of 0.616. <xref ref-type="bibr" rid="B35">Tulowiecki (2020)</xref> also found the gradient boosting model to have the highest AUC of all SDMs when modeling species distribution of <italic>C. dentata</italic> in western New York. This suggests that gradient boosting models excel in predicting <italic>C. dentata</italic> distribution based on environmental data and future modeling attempts should include them. An AUC and TSS of 1.000 in the random forest model indicate model overfitting, so these results must be examined further. Because all SDMs utilized randomly generated pseudo-absence points as opposed to true absence data collected in the field, our models may contain some introduced error. However, the use of multiple pseudo-absence datasets mitigates that error, and given the high accuracy metrics of our results and consistency with other studies of <italic>C. dentata</italic> habitat, we believe that our results are meaningful and valid for habitat prediction.</p>
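<p>To make the meaning of an AUC of 1.000 concrete, the following Python sketch computes AUC from its rank interpretation: the share of presence and pseudo-absence pairs in which the presence point receives the higher suitability, with ties counted as one half. The function name and the suitability values are hypothetical, and this pairwise form is an illustration rather than the routine used in this study.</p>
<preformat>
```python
def auc_from_scores(presence_scores, absence_scores):
    """AUC as the fraction of presence/absence pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(presence_scores) * len(absence_scores))


# Hypothetical suitability values.
perfect = auc_from_scores([0.9, 0.8], [0.1, 0.2])
imperfect = auc_from_scores([0.9, 0.4], [0.5, 0.1])
```
</preformat>
<p>An AUC of 1.000 therefore means that every presence point was ranked above every pseudo-absence point in the evaluation data. Such perfect separation is uncommon for field data and is one reason the random forest result is treated here as a sign of overfitting.</p>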
<p>The lower-performing ANN model also varied the most from other SDM techniques concerning environmental variable importance. This analysis identified canopy cover, distance to streams, and elevation as the three most important environmental variables for determining species distribution of <italic>C. dentata</italic>. Furthermore, this was the only model that did not identify sand to clay ratio as the most important variable and the only model to identify distance to streams as among the most important. ANN identified distance to streams as having a median variable importance score of 0.500 &#x2013; the next highest score for distance to streams was 0.069, from MaxEnt. For elevation, the random forest model was the only other model to identify it as among the three most important environmental variables. While the ANN model identified the relationship between elevation and probability of <italic>C. dentata</italic> presence as positive, the relationship between elevation and probability of <italic>C. dentata</italic> presence identified by the random forest model was more complex, with suitability values varying considerably over the range of elevation values as compared to variation in other models. Even though no other models showed these same variable importance values or relationships, it does make ecological sense that American chestnut would be found at higher elevations further from streams where there would be less soil moisture. Because the ANN model performed the worst based on accuracy statistics and identified different environmental variables as the most important predictors of <italic>C. dentata</italic> habitat, it may be a less useful technique compared to other SDMs for evaluating suitable American chestnut habitat.</p>
<p>Overall, this study shows a variety of metrics explaining <italic>C. dentata</italic> species distribution and suggests the usefulness of ensemble modeling of SDMs. By utilizing nine different SDM techniques on the same dataset of species occurrences, pseudo-absences, and environmental variables, this study highlights differences between models that may not appear if models were considered separately. Ensemble modeling also allows for further confidence in results because it allows direct comparison between different models, making it possible to assess whether the same areas are identified as suitable habitat and to compare the relative influence of different environmental variables. For example, eight SDM techniques identified sand to clay ratio as the most important variable in modeling <italic>C. dentata</italic> distribution and all showed a positive relationship with probability of presence, thus adding weight to this finding. Even though the artificial neural networks model highlighted elevation as a highly important variable for predicting <italic>C. dentata</italic> distribution in Pennsylvania, the fact that none of the other modeling techniques identified this variable as particularly important suggests that this may be an error or a less significant relationship and we should study the relationship further.</p>
<p>Lastly, we acknowledge that the nature of the collection of <italic>C. dentata</italic> occurrence records may have introduced some bias into this study, as most recorded American chestnut locations are along trails or roads, where citizen scientists can more readily see the trees. Because we lack accurate data on sampling bias in this dataset, we used randomly generated pseudo-absence points to run the SDMs (<xref ref-type="bibr" rid="B24">Phillips et al., 2009</xref>). Additionally, this study is limited by the opacity of some of the modeling outputs. It would be beneficial to be able to explain differences in the mechanics and outputs of each individual modeling approach as they relate to the final ensemble modeling output. This is, however, addressed by the capacity of ensemble models to combine the strengths and weaknesses of the individual SDMs. Future research on <italic>C. dentata</italic> habitat modeling would benefit from using multi-model approaches that consider a broader set of environmental variables.</p>
</sec>
<sec id="S5" sec-type="conclusion">
<title>Conclusion</title>
<p>This methodology-focused paper presents some of the benefits of using SDM ensemble modeling when studying <italic>C. dentata</italic> distribution in Pennsylvania. We found that while individual SDM techniques generally picked out similar environmental variables as important predictors of habitat suitability, there was still variation in effect, importance, and accuracy. By combining SDM techniques through ensemble modeling, we can produce distribution maps weighted by accuracy metrics, allowing us to be more confident in the results. This streamlined process of ensemble modeling, made possible through &#x201C;ShinyBIOMOD&#x201D; and the &#x201C;biomod2&#x201D; R package, will be useful in assisting conservationists to both find more surviving American chestnut trees and confidently identify areas suitable for reintroduction efforts.</p>
</sec>
<sec id="S6" sec-type="data-availability">
<title>Data availability statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: Environmental variable sources are described in <xref ref-type="table" rid="T1">Table 1</xref> of the manuscript coming from <ext-link ext-link-type="uri" xlink:href="https://www.pasda.psu.edu/">https://www.pasda.psu.edu/</ext-link>, <ext-link ext-link-type="uri" xlink:href="https://www.mrlc.gov/data">https://www.mrlc.gov/data</ext-link>, and <ext-link ext-link-type="uri" xlink:href="https://www.isric.org/explore/isric-soil-data-hub">https://www.isric.org/explore/isric-soil-data-hub</ext-link>. The dataset of American chestnut locations is not publicly available per TACF data usage agreement.</p>
</sec>
<sec id="S7">
<title>Author contributions</title>
<p>AH: conceptualization, data collection, methodology, formal data analysis, and writing &#x2013; original draft. JS: conceptualization, data collection, methodology, formal data analysis, writing &#x2013; review and editing, and supervision. PK: methodology, writing &#x2013; review and editing, supervision, and conceptualization. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec id="S8" sec-type="funding-information">
<title>Funding</title>
<p>This work received funding from Villanova University&#x2019;s Falvey Memorial Library Scholarship Open Access Reserve (SOAR) Fund.</p>
</sec>
<ack><p>We thank Sara Fitzsimmons at The American Chestnut Foundation (TACF) for providing the input data for this research. We also would like to thank the reviewers of the manuscript for their helpful feedback.</p>
</ack>
<sec id="S9" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="S10" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allouche</surname> <given-names>O.</given-names></name> <name><surname>Tsoar</surname> <given-names>A.</given-names></name> <name><surname>Kadmon</surname> <given-names>R.</given-names></name></person-group> (<year>2006</year>). <article-title>Assessing the accuracy of species distribution models: Prevalence, kappa and the true skill statistic (TSS).</article-title> <source><italic>J. Appl. Ecol.</italic></source> <volume>43</volume> <fpage>1223</fpage>&#x2013;<lpage>1232</lpage>. <pub-id pub-id-type="doi">10.1111/j.1365-2664.2006.01214.x</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barnes</surname> <given-names>J. C.</given-names></name> <name><surname>Delborne</surname> <given-names>J. A.</given-names></name></person-group> (<year>2019</year>). <article-title>Rethinking restoration targets for American chestnut using species distribution modeling.</article-title> <source><italic>Biodivers. Conserv.</italic></source> <volume>28</volume> <fpage>3199</fpage>&#x2013;<lpage>3220</lpage>.</citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beers</surname> <given-names>T. W.</given-names></name> <name><surname>Dress</surname> <given-names>P. E.</given-names></name> <name><surname>Wensel</surname> <given-names>L. C.</given-names></name></person-group> (<year>1966</year>). <article-title>Notes and observations: Aspect transformation in site productivity research.</article-title> <source><italic>J. For.</italic></source> <volume>64</volume> <fpage>691</fpage>&#x2013;<lpage>692</lpage>. <pub-id pub-id-type="doi">10.1093/jof/64.10.691</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Booth</surname> <given-names>T. H.</given-names></name></person-group> (<year>2018</year>). <article-title>Species distribution modelling tools and databases to assist managing forests under climate change.</article-title> <source><italic>For. Ecol. Manag.</italic></source> <volume>430</volume>:<issue>1960293</issue>. <pub-id pub-id-type="doi">10.1016/j.foreco.2018.08.019</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Collins</surname> <given-names>R. J.</given-names></name> <name><surname>Copenheaver</surname> <given-names>C. A.</given-names></name> <name><surname>Kester</surname> <given-names>M. E.</given-names></name> <name><surname>Barker</surname> <given-names>E. J.</given-names></name> <name><surname>DeBose</surname> <given-names>K. G.</given-names></name></person-group> (<year>2017</year>). <article-title>American Chestnut: Re-examining the historical attributes of a lost tree.</article-title> <source><italic>J. For.</italic></source> <volume>116</volume> <fpage>68</fpage>&#x2013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.5849/JOF-2016-014</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Diamond</surname> <given-names>S. J.</given-names></name> <name><surname>Giles</surname> <given-names>R. H.</given-names> <suffix>Jr.</suffix></name> <name><surname>Kirkpatrick</surname> <given-names>R. L.</given-names></name> <name><surname>Griffin</surname> <given-names>G. J.</given-names></name></person-group> (<year>2000</year>). <article-title>Hard mast production before and after the chestnut blight.</article-title> <source><italic>Southern J. Appl. For.</italic></source> <volume>24</volume> <fpage>196</fpage>&#x2013;<lpage>201</lpage>. <pub-id pub-id-type="doi">10.1093/sjaf/24.4.196</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dutech</surname> <given-names>C.</given-names></name> <name><surname>Barres</surname> <given-names>B.</given-names></name> <name><surname>Bridier</surname> <given-names>J.</given-names></name> <name><surname>Robin</surname> <given-names>C.</given-names></name> <name><surname>Milgroom</surname> <given-names>M. G.</given-names></name> <name><surname>Ravigne</surname> <given-names>V.</given-names></name></person-group> (<year>2012</year>). <article-title>The chestnut blight fungus world tour: Successive introduction events from diverse origins in an invasive plant fungal pathogen.</article-title> <source><italic>Mol. Ecol.</italic></source> <volume>21</volume> <fpage>3931</fpage>&#x2013;<lpage>3946</lpage>. <pub-id pub-id-type="doi">10.1111/j.1365-294X.2012.05575.x</pub-id> <pub-id pub-id-type="pmid">22548317</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elith</surname> <given-names>J.</given-names></name> <name><surname>Leathwick</surname> <given-names>J. R.</given-names></name></person-group> (<year>2009</year>). <article-title>Species distribution models: Ecological explanation and prediction across space and time.</article-title> <source><italic>Annu. Rev. Ecol. Evol. Syst.</italic></source> <volume>40</volume> <fpage>677</fpage>&#x2013;<lpage>697</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.ecolsys.110308.120159</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><collab>Environmental Resources Research Institute</collab> (<year>1998</year>). <source><italic>Networked streams of Pennsylvania.</italic></source> <publisher-loc>Harrisburg, PA</publisher-loc>: <publisher-name>Environmental Resources Research Institute</publisher-name>.</citation></ref>
<ref id="B10"><citation citation-type="journal"><collab>ESRI</collab> (<year>2011</year>). <source><italic>ArcGIS desktop: Release 10.</italic></source> <publisher-loc>Redlands, CA</publisher-loc>: <publisher-name>Environmental Systems Research Institute</publisher-name>.</citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fawcett</surname> <given-names>T.</given-names></name></person-group> (<year>2006</year>). <article-title>An introduction to ROC analysis.</article-title> <source><italic>Pattern Recogn. Lett.</italic></source> <volume>27</volume> <fpage>861</fpage>&#x2013;<lpage>874</lpage>. <pub-id pub-id-type="doi">10.1016/j.patrec.2005.10.010</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fei</surname> <given-names>S.</given-names></name> <name><surname>Liang</surname> <given-names>L.</given-names></name> <name><surname>Paillet</surname> <given-names>F. L.</given-names></name> <name><surname>Steiner</surname> <given-names>K. C.</given-names></name> <name><surname>Fang</surname> <given-names>J.</given-names></name> <name><surname>Shen</surname> <given-names>Z.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>Modelling chestnut biogeography for American chestnut restoration: Chestnut biogeography.</article-title> <source><italic>Divers. Distrib.</italic></source> <volume>18</volume> <fpage>754</fpage>&#x2013;<lpage>768</lpage>. <pub-id pub-id-type="doi">10.1111/j.1472-4642.2012.00886.x</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fei</surname> <given-names>S.</given-names></name> <name><surname>Schibig</surname> <given-names>J.</given-names></name> <name><surname>Vance</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <article-title>Spatial habitat modeling of American chestnut at Mammoth Cave National Park.</article-title> <source><italic>For. Ecol. Manag.</italic></source> <volume>252</volume> <fpage>201</fpage>&#x2013;<lpage>207</lpage>. <pub-id pub-id-type="doi">10.1016/j.foreco.2007.06.036</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Franklin</surname> <given-names>J.</given-names></name></person-group> (<year>2010</year>). <source><italic>Mapping species distributions: Spatial inference and prediction.</italic></source> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Homer</surname> <given-names>C. G.</given-names></name> <name><surname>Fry</surname> <given-names>J. A.</given-names></name> <name><surname>Barnes</surname> <given-names>C. A.</given-names></name></person-group> (<year>2012</year>). <source><italic>The National Land Cover Database. The National Land Cover Database (USGS Numbered Series No. 2012&#x2013;3020; Fact Sheet, Vols. 2012&#x2013;3020).</italic></source> <publisher-loc>Reston, VA</publisher-loc>: <publisher-name>U.S. Geological Survey</publisher-name>, <pub-id pub-id-type="doi">10.3133/fs20123020</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>MacDonald</surname> <given-names>W. L.</given-names></name> <name><surname>Cech</surname> <given-names>F. C.</given-names></name> <name><surname>Luchok</surname> <given-names>J.</given-names></name> <name><surname>Smith</surname> <given-names>C.</given-names></name></person-group> (<year>1978</year>). <source><italic>Proceedings of the American chestnut symposium.</italic></source> <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>USDA</publisher-name>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Matthews</surname> <given-names>S. N.</given-names></name> <name><surname>Iverson</surname> <given-names>L. R.</given-names></name> <name><surname>Prasad</surname> <given-names>A. M.</given-names></name> <name><surname>Peters</surname> <given-names>M. P.</given-names></name> <name><surname>Rodewald</surname> <given-names>P. G.</given-names></name></person-group> (<year>2011</year>). <article-title>Modifying climate change habitat models using tree species-specific assessments of model uncertainty and life history-factors.</article-title> <source><italic>For. Ecol. Manag.</italic></source> <volume>262</volume> <fpage>1460</fpage>&#x2013;<lpage>1472</lpage>.</citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milgroom</surname> <given-names>M. G.</given-names></name> <name><surname>Wang</surname> <given-names>K.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Lipari</surname> <given-names>S. E.</given-names></name> <name><surname>Kaneko</surname> <given-names>S.</given-names></name></person-group> (<year>1996</year>). <article-title>Intercontinental population structure of the chestnut blight fungus, <italic>Cryphonectria parasitica</italic>.</article-title> <source><italic>Mycologia</italic></source> <volume>88</volume> <fpage>179</fpage>&#x2013;<lpage>190</lpage>. <pub-id pub-id-type="doi">10.2307/3760921</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Noah</surname> <given-names>P. H.</given-names></name> <name><surname>Cagle</surname> <given-names>N. L.</given-names></name> <name><surname>Westbrook</surname> <given-names>J. W.</given-names></name> <name><surname>Fitzsimmons</surname> <given-names>S. F.</given-names></name></person-group> (<year>2021</year>). <article-title>Identifying resilient restoration targets: Mapping and forecasting habitat suitability for <italic>Castanea dentata</italic> in Eastern USA under different climate-change scenarios.</article-title> <source><italic>Clim. Change Ecol.</italic></source> <volume>2</volume>:<issue>100037</issue>. <pub-id pub-id-type="doi">10.1016/j.ecochg.2021.100037</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nowacki</surname> <given-names>G. J.</given-names></name> <name><surname>Abrams</surname> <given-names>M. D.</given-names></name></person-group> (<year>1992</year>). <article-title>Community, edaphic, and historical analysis of mixed oak forests of the Ridge and Valley Province, in central Pennsylvania.</article-title> <source><italic>Can. J. For. Res.</italic></source> <volume>22</volume> <fpage>790</fpage>&#x2013;<lpage>800</lpage>.</citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paillet</surname> <given-names>F. L.</given-names></name></person-group> (<year>2002</year>). <article-title>Chestnut: History and ecology of a transformed species.</article-title> <source><italic>J. Biogeogr.</italic></source> <volume>29</volume> <fpage>1517</fpage>&#x2013;<lpage>1530</lpage>. <pub-id pub-id-type="doi">10.1046/j.1365-2699.2002.00767.x</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parker</surname> <given-names>A. J.</given-names></name></person-group> (<year>1982</year>). <article-title>The topographic relative moisture index: An approach to soil-moisture assessment in mountain terrain.</article-title> <source><italic>Phys. Geogr.</italic></source> <volume>3</volume> <fpage>160</fpage>&#x2013;<lpage>168</lpage>. <pub-id pub-id-type="doi">10.1080/02723646.1982.10642224</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><collab>Pennsylvania Spatial Data Access</collab> (<year>2022</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.pasda.psu.edu/">https://www.pasda.psu.edu/</ext-link> <comment>(accessed April 11, 2022)</comment>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phillips</surname> <given-names>S. J.</given-names></name> <name><surname>Dudik</surname> <given-names>M.</given-names></name> <name><surname>Elith</surname> <given-names>J.</given-names></name> <name><surname>Graham</surname> <given-names>C. H.</given-names></name> <name><surname>Lehmann</surname> <given-names>A.</given-names></name> <name><surname>Leathwick</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>Sample selection bias and presence-only distribution models: Implications for background and pseudo-absence data.</article-title> <source><italic>Ecol. Appl.</italic></source> <volume>19</volume> <fpage>181</fpage>&#x2013;<lpage>197</lpage>. <pub-id pub-id-type="doi">10.1890/07-2153.1</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poggio</surname> <given-names>L.</given-names></name> <name><surname>de Sousa</surname> <given-names>L. M.</given-names></name> <name><surname>Batjes</surname> <given-names>N. H.</given-names></name> <name><surname>Heuvelink</surname> <given-names>G. B. M.</given-names></name> <name><surname>Kempen</surname> <given-names>B.</given-names></name> <name><surname>Ribeiro</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>SoilGrids 2.0: Producing soil information for the globe with quantified spatial uncertainty.</article-title> <source><italic>Soil</italic></source> <volume>7</volume> <fpage>217</fpage>&#x2013;<lpage>240</lpage>. <pub-id pub-id-type="doi">10.5194/soil-7-217-2021</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><collab>R Core Team</collab> (<year>2013</year>). <source><italic>R: A language and environment for statistical computing.</italic></source> <publisher-loc>Vienna</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>.</citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rigling</surname> <given-names>D.</given-names></name> <name><surname>Prospero</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title><italic>Cryphonectria parasitica</italic>, the causal agent of chestnut blight: Invasion history, population biology and disease control.</article-title> <source><italic>Mol. Plant Pathol.</italic></source> <volume>19</volume> <fpage>7</fpage>&#x2013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1111/mpp.12542</pub-id> <pub-id pub-id-type="pmid">28142223</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Russell</surname> <given-names>E. W. B.</given-names></name></person-group> (<year>1987</year>). <article-title>Pre-blight distribution of <italic>Castanea dentata</italic> (Marsh.) Borkh.</article-title> <source><italic>Bull. Torrey Bot. Club</italic></source> <volume>114</volume> <fpage>183</fpage>&#x2013;<lpage>190</lpage>.</citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steiner</surname> <given-names>K. C.</given-names></name> <name><surname>Carlson</surname> <given-names>J. E.</given-names></name></person-group> (<year>2006</year>). &#x201C;<article-title>Restoration of American Chestnut to Forest Lands</article-title>,&#x201D; in <source><italic>Proceedings of a Conference and Workshop Held May 4&#x2013;6, 2004 at the North Carolina Arboretum. U.S. Department of the Interior, National Park Service, National Capital Region, Center for Urban Ecology</italic></source>, (<publisher-loc>Asheville, NC</publisher-loc>).</citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steiner</surname> <given-names>K. C.</given-names></name> <name><surname>Westbrook</surname> <given-names>J. W.</given-names></name> <name><surname>Hebard</surname> <given-names>F. V.</given-names></name> <name><surname>Georgi</surname> <given-names>L. L.</given-names></name> <name><surname>Powell</surname> <given-names>W. A.</given-names></name> <name><surname>Fitzsimmons</surname> <given-names>S. F.</given-names></name></person-group> (<year>2017</year>). <article-title>Rescue of American chestnut with extraspecific genes following its destruction by a naturalized pathogen.</article-title> <source><italic>New For.</italic></source> <volume>48</volume> <fpage>317</fpage>&#x2013;<lpage>336</lpage>. <pub-id pub-id-type="doi">10.1007/s11056-016-9561-5</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Swets</surname> <given-names>J. A.</given-names></name></person-group> (<year>1988</year>). <article-title>Measuring the accuracy of diagnostic systems.</article-title> <source><italic>Science</italic></source> <volume>240</volume> <fpage>1285</fpage>&#x2013;<lpage>1293</lpage>. <pub-id pub-id-type="doi">10.1126/science.3287615</pub-id> <pub-id pub-id-type="pmid">3287615</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><collab>TACF</collab> (<year>2020</year>). <source><italic>Using Science to Save the American Chestnut Tree [WWW Document].</italic></source> <publisher-loc>Asheville, NC</publisher-loc>: <publisher-name>The American Chestnut Foundation</publisher-name>.</citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thuiller</surname> <given-names>W.</given-names></name> <name><surname>Georges</surname> <given-names>D.</given-names></name> <name><surname>Engler</surname> <given-names>R.</given-names></name> <name><surname>Breiner</surname> <given-names>F.</given-names></name></person-group> (<year>2016</year>). <source><italic>biomod2: Ensemble Platform for Species Distribution Modeling.</italic></source></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thuiller</surname> <given-names>W.</given-names></name> <name><surname>Lafourcade</surname> <given-names>B.</given-names></name> <name><surname>Engler</surname> <given-names>R.</given-names></name> <name><surname>Ara&#x00FA;jo</surname> <given-names>M. B.</given-names></name></person-group> (<year>2009</year>). <article-title>BIOMOD &#x2013; a platform for ensemble forecasting of species distributions.</article-title> <source><italic>Ecography</italic></source> <volume>32</volume> <fpage>369</fpage>&#x2013;<lpage>373</lpage>. <pub-id pub-id-type="doi">10.1111/j.1600-0587.2008.05742.x</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tulowiecki</surname> <given-names>S. J.</given-names></name></person-group> (<year>2020</year>). <article-title>Modeling the historical distribution of American chestnut (<italic>Castanea dentata</italic>) for potential restoration in western New York State, U.S.</article-title> <source><italic>For. Ecol. Manag.</italic></source> <volume>462</volume>:<issue>118003</issue>. <pub-id pub-id-type="doi">10.1016/j.foreco.2020.118003</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><collab>U.S. Geological Survey</collab> (<year>2000</year>). <source><italic>7.5 minute digital elevation models (DEM) for Pennsylvania (30 meter).</italic></source> <publisher-loc>Reston, VA</publisher-loc>: <publisher-name>U.S. Geological Survey</publisher-name>.</citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Westbrook</surname> <given-names>J. W.</given-names></name> <name><surname>Holliday</surname> <given-names>J. A.</given-names></name> <name><surname>Newhouse</surname> <given-names>A. E.</given-names></name> <name><surname>Powell</surname> <given-names>W. A.</given-names></name></person-group> (<year>2019</year>). <article-title>A plan to diversify a transgenic blight-tolerant American chestnut population using citizen science.</article-title> <source><italic>Plants People Planet</italic></source> <volume>2</volume> <fpage>84</fpage>&#x2013;<lpage>95</lpage>. <pub-id pub-id-type="doi">10.1002/ppp3.10061</pub-id></citation></ref>
</ref-list>
</back>
</article>