<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Ecol. Evol.</journal-id>
<journal-title>Frontiers in Ecology and Evolution</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Ecol. Evol.</abbrev-journal-title>
<issn pub-type="epub">2296-701X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fevo.2023.1193602</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Ecology and Evolution</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Determining representative pseudo-absences for invasive plant distribution modeling based on geographic similarity</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Xiao</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<xref rid="aff3" ref-type="aff"><sup>3</sup></xref>
<xref rid="aff4" ref-type="aff"><sup>4</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2244891/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Xu</surname>
<given-names>Quanli</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<xref rid="aff3" ref-type="aff"><sup>3</sup></xref>
<xref rid="aff4" ref-type="aff"><sup>4</sup></xref>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2258952/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Liu</surname>
<given-names>Jing</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<xref rid="aff3" ref-type="aff"><sup>3</sup></xref>
<xref rid="aff4" ref-type="aff"><sup>4</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2258299/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Geography, Yunnan Normal University</institution>, <addr-line>Kunming</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>GIS Technology Engineering Research Centre for West-China Resources and Environment of Educational Ministry</institution>, <addr-line>Kunming</addr-line>, <country>China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Yunnan Geospatial Information Technology Engineering Research Center</institution>, <addr-line>Kunming</addr-line>, <country>China</country></aff>
<aff id="aff4"><sup>4</sup><institution>Key Laboratory of Resources and Environment Remote Sensing in Yunnan University</institution>, <addr-line>Kunming</addr-line>, <country>China</country></aff>
<author-notes>
<fn id="fn0001" fn-type="edited-by"><p>Edited by: Paulo A. V. Borges, University of the Azores, Portugal</p></fn>
<fn id="fn0002" fn-type="edited-by"><p>Reviewed by: Gabor Pozsgai, University of the Azores, Portugal; Marco Girardello, Joint Research Centre, Italy</p></fn>
<corresp id="c001">&#x002A;Correspondence: Quanli Xu, <email>go2happiness@163.com</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>06</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>11</volume>
<elocation-id>1193602</elocation-id>
<history>
<date date-type="received">
<day>28</day>
<month>03</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>05</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2023 Wang, Xu and Liu.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Wang, Xu and Liu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>The use of pseudo-absence data constrained by environmental conditions can facilitate potential distribution predictions of invasive species. However, pseudo-absence data generated by existing methods are usually not representative because the relationship between the presence and pseudo-absence points is either oversimplified or neglected. This can lead to under- or overestimation of the potential distribution of invasive species.</p>
</sec>
<sec>
<title>Methods</title>
<p>To address this deficiency, this study proposes a new method for obtaining pseudo-absence data based on geographic similarities. First, the reliability of pseudo-absences was quantified based on the geographic similarity to the occurrence of species. Subsequently, a representative pseudo-absence reliability threshold interval was determined. Finally, different pseudo-absence acquisition methods were assessed by combining virtual species with a real invasive species.</p>
</sec>
<sec>
<title>Results</title>
<p>The analysis demonstrated that the geographic similarity method can improve model accuracy and achieve a more realistic distribution compared with the traditional method of sampling for pseudo-absence data.</p>
</sec>
<sec>
<title>Discussion</title>
<p>This result indicates that the pseudo-absence data obtained using the geographic similarity approach were more representative. Our study provides valuable insights into improving invasive plant distribution predictions by considering the geographical relationships between species occurrences and the surrounding environments.</p>
</sec>
</abstract>
<kwd-group>
<kwd>species distribution modeling</kwd>
<kwd>biological invasion</kwd>
<kwd>pseudo-absence</kwd>
<kwd>absence</kwd>
<kwd>representation</kwd>
</kwd-group>
<contract-sponsor id="cn1">National Natural Science Fund of China</contract-sponsor>
<counts>
<fig-count count="6"/>
<table-count count="0"/>
<equation-count count="5"/>
<ref-count count="54"/>
<page-count count="11"/>
<word-count count="6641"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Models in Ecology and Evolution</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>Invasive plants cause a loss of species habitat and diversity; therefore, studying their distribution is important for ecological conservation (<xref ref-type="bibr" rid="ref31">Py&#x0161;ek et al., 2012</xref>; <xref ref-type="bibr" rid="ref3">Blackburn et al., 2019</xref>). Species distribution models (SDMs), also known as ecological niche models, use associations between the known occurrence (presence) of species and environmental conditions to estimate the potential geographic distribution of species, and have become a principal tool for studying the distribution of invasive species (<xref ref-type="bibr" rid="ref18">Guisan and Thuiller, 2005</xref>; <xref ref-type="bibr" rid="ref9">Elith et al., 2006</xref>; <xref ref-type="bibr" rid="ref11">Elith and Leathwick, 2009</xref>). The quality and representativeness of the distribution data applied to SDMs are vital, because samples used to infer relationships between variables should be representative of the underlying population (<xref ref-type="bibr" rid="ref49">Zaniewski et al., 2002</xref>; <xref ref-type="bibr" rid="ref24">Lobo, 2008</xref>; <xref ref-type="bibr" rid="ref33">Rocchini et al., 2011</xref>; <xref ref-type="bibr" rid="ref38">Tessarolo et al., 2021</xref>). Depending on the distribution data of species used, SDMs can be classified into presence-only and presence-absence models (<xref ref-type="bibr" rid="ref6">Brotons et al., 2004</xref>; <xref ref-type="bibr" rid="ref9">Elith et al., 2006</xref>). Unlike presence-only models, presence-absence models require additional species-absence information to explore species-environment relationships. 
Moreover, several comparisons of various SDMs have demonstrated that presence-absence models tend to perform better than presence-only models (<xref ref-type="bibr" rid="ref6">Brotons et al., 2004</xref>; <xref ref-type="bibr" rid="ref12">Engler et al., 2004</xref>; <xref ref-type="bibr" rid="ref9">Elith et al., 2006</xref>; <xref ref-type="bibr" rid="ref17">Guisan et al., 2006</xref>). Presence data can be acquired through <italic>in situ</italic> collections, herbaria, or web-based information databases, such as the Global Biodiversity Information Facility (GBIF; <xref ref-type="bibr" rid="ref9">Elith et al., 2006</xref>; <xref ref-type="bibr" rid="ref10">Elith and Leathwick, 2007</xref>; <xref ref-type="bibr" rid="ref44">Wisz et al., 2008</xref>). However, these occurrence data may suffer from spatial bias (sample selection bias) because some sites are more likely to be surveyed or some species are under-recorded (<xref ref-type="bibr" rid="ref20">Hortal et al., 2007</xref>; <xref ref-type="bibr" rid="ref30">Phillips et al., 2009</xref>). Obtaining real absence data is sometimes challenging; thus, SDMs use random pseudo-absence (background or implied absence) data from the study area to reveal the environmental information available. Even with the inherent spatial bias in the collected data, pseudo-absence data selected through presence data are still useful, especially when available data records are rare (e.g., managing invasive or endangered species; <xref ref-type="bibr" rid="ref12">Engler et al., 2004</xref>; <xref ref-type="bibr" rid="ref29">Peterson et al., 2018</xref>). Furthermore, even if the quality of the presence data is high, pseudo-absence data should be carefully selected, as this is critical for the relative accuracy of SDMs (<xref ref-type="bibr" rid="ref25">Lobo et al., 2010</xref>; <xref ref-type="bibr" rid="ref36">Smith et al., 2013</xref>).</p>
<p>Two methods have been proposed for obtaining pseudo-absence data. The first involves random sampling of the entire geographic area, providing a broad representation of the environmental space (<xref ref-type="bibr" rid="ref37">Stockwell, 1999</xref>; <xref ref-type="bibr" rid="ref19">Hirzel et al., 2001</xref>). However, this method may generate abundant false absences, leading to erroneous predictions, particularly when records of species are scarce (<xref ref-type="bibr" rid="ref12">Engler et al., 2004</xref>; <xref ref-type="bibr" rid="ref25">Lobo et al., 2010</xref>). The second method samples a specific region rather than the entire area and can be delineated in geographic (geographic constraint method) or environmental space (environmental constraint method; <xref ref-type="bibr" rid="ref27">Lobo and Tognelli, 2011</xref>; <xref ref-type="bibr" rid="ref2">Barbet-Massin et al., 2012</xref>). This method increases the probability of obtaining absence data in places with environmental conditions different from those of the presence data, crucial for predicting the potential distribution of species (<xref ref-type="bibr" rid="ref25">Lobo et al., 2010</xref>). However, spatial distance buffers used to define geographic ranges are often arbitrary and subjective (<xref ref-type="bibr" rid="ref40">VanDerWal et al., 2009</xref>; <xref ref-type="bibr" rid="ref2">Barbet-Massin et al., 2012</xref>). An ecological approach known as the two-step method has been employed, which selects pseudo-absence data from unsuitable areas predicted by the BIOCLIM envelope model or ecological niche factor analysis based on presence data only (<xref ref-type="bibr" rid="ref12">Engler et al., 2004</xref>; <xref ref-type="bibr" rid="ref43">Wisz and Guisan, 2009</xref>). However, this method produces overly optimistic forecasts (<xref ref-type="bibr" rid="ref12">Engler et al., 2004</xref>). 
Without objective thresholds, pseudo-absence data tends to be selected within a &#x201C;narrow&#x201D; range, causing overestimated predictions with an underrepresented geographic sample and an overly broad distribution (<xref ref-type="bibr" rid="ref25">Lobo et al., 2010</xref>). Subsequently, a three-step pseudo-absence data selection technique that balances both geographic and environmental dimensions has been proposed to avoid such overestimations (<xref ref-type="bibr" rid="ref35">Senay et al., 2013</xref>; <xref ref-type="bibr" rid="ref21">Iturbide et al., 2015</xref>). This approach attempts to determine sampling geographic ranges based on changes in the importance of environmental variables, and subsequently applies classifiers and clusters to select representative pseudo-absence data. However, this approach introduces additional uncertainty when determining distances.</p>
<p>The use of pseudo-absence data is crucial for the distribution modeling of species, but to ensure accuracy, these data must closely represent the geographical study area. Current pseudo-absence data-selection methods either under- or overestimate potential distributions, both of which are detrimental to the management of invasive species. The difficulty in applying existing methods to determine a reasonable sampling range is the main driver behind the abovementioned underrepresentation. As the existing methods often involve explicit or implicit arbitrary assumptions, a &#x201C;smooth&#x201D; approach needs to be developed to obtain a reliable threshold. To this end, we proposed quantifying the environmental similarity between the unknown location and the occurrence location, as locations with similar environmental characteristics are more likely to have similar species distribution characteristics (<xref ref-type="bibr" rid="ref5">Broennimann et al., 2012</xref>; <xref ref-type="bibr" rid="ref39">Tocchio et al., 2015</xref>). This is known as the &#x201C;geographic similarity principle,&#x201D; namely, <italic>the more similar the geographical configuration of two points (regions), the more similar the value (process) of the target variable at these two points (regions)</italic>, and it has been shown to improve the reliability of distribution predictions of geographic phenomena (e.g., landslides and soils; <xref ref-type="bibr" rid="ref52">Zhu et al., 2018</xref>; <xref ref-type="bibr" rid="ref48">Xu et al., 2023b</xref>). Based on this principle, we proposed a method for obtaining pseudo-absence data of species that considers geographic similarity to improve the pseudo-absence data quality. By exploring the correlation between the distribution of species and the geographic environment, we calculated the confidence level of a location being an absence based on its degree of similarity to known distribution locations of the species. 
However, using pseudo-absences that are too far (i.e., too dissimilar) from the presence data may overestimate the potential distribution (<xref ref-type="bibr" rid="ref25">Lobo et al., 2010</xref>). Therefore, we introduced a new metric, the predictive efficiency index (PEI; discussed later), to evaluate prediction overestimation.</p>
<p>The main aim of this study is to use the geographic similarity principle to improve the representation of pseudo-absence data, thus improving the potential distribution prediction of invasive species. To this end, we tested and compared the performance of traditional methods (&#x201C;random,&#x201D; &#x201C;geographic constraints,&#x201D; and &#x201C;environmental constraints&#x201D;) and a new geographic similarity-based approach (pseudo-absence selection method) in presence-absence models using virtual species (<xref ref-type="bibr" rid="ref28">Meynard et al., 2019</xref>) and a real case of <italic>Ageratina adenophora</italic> (Spreng.) R.M. King and H. Rob (Asterales, Asteraceae) distribution in Yunnan, China. Specifically, we applied each of the four pseudo-absence selection methods described above to virtual species, as well as real <italic>A. adenophora</italic> SDMs, and tested their performance under different biases and presence numbers (30, 50, 100, and 300). We validated the new methods by comparing the model results with the known (virtual) distribution suitability of species.</p>
</sec>
<sec id="sec2" sec-type="materials|methods">
<label>2.</label>
<title>Materials and methods</title>
<sec id="sec3">
<label>2.1.</label>
<title>Research processes</title>
<p>This study consisted of three steps (<xref rid="fig1" ref-type="fig">Figure 1</xref>). First, the geographic environmental similarity between known presence points of the species and unknown locations was computed to assess the reliability of the unknown locations as pseudo-absences. This reliability measure guides the sampling of pseudo-absences. Second, the impact of pseudo-absences with varying levels of reliability on the predicted distribution of invasive species was tested, to understand how reliability influences the accuracy and effectiveness of the distribution models. Finally, the feasibility and effectiveness of the proposed method were compared with those of traditional approaches.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption><p>Research flow chart.</p></caption>
<graphic xlink:href="fevo-11-1193602-g001.tif"/>
</fig>
</sec>
<sec id="sec4">
<label>2.2.</label>
<title><italic>Ageratina adenophora</italic> and environmental variables</title>
<p>Yunnan Province (China), located on the border of Southwest China and covering a total area of 394,100&#x2009;km<sup>2</sup>, was selected as the study area (<xref rid="fig2" ref-type="fig">Figure 2</xref>). Mountainous regions account for 84% of its total area, and its complex topography includes rivers and lakes. Although Yunnan features a diverse ecological environment, as a border province with frequent foreign exchanges, it is vulnerable to biological invasions, which threaten its biodiversity and natural environment. <italic>A. adenophora</italic> is a successful invasive plant species in Yunnan and was ranked first among the 16 most important invasive alien species identified by China&#x2019;s State Environmental Protection Administration in 2003 (<xref ref-type="bibr" rid="ref50">Zhang et al., 2007</xref>). Here, we obtained 300 valid <italic>A. adenophora</italic> distribution points from the literature (<xref ref-type="bibr" rid="ref45">Xian et al., 2023</xref>; <xref rid="fig2" ref-type="fig">Figure 2</xref>). Nine environmental variables (Pearson&#x2019;s |<italic>r</italic>|&#x2009;&#x003C;&#x2009;0.8, reducing the effect of multicollinearity) associated with the growth and spread of <italic>A. adenophora</italic> were selected for analysis, including bioclimates from WorldClim (<xref ref-type="bibr" rid="ref14">Fick and Hijmans, 2017</xref>; bio2, bio9, bio14, bio15, bio16, and bio19), topsoil organic matter, and acid&#x2013;base conditions (toc, tph). A detailed description of the environmental variables is provided in <xref rid="sec21" ref-type="sec">Supplementary material</xref>.</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption><p>Map of the study area in Yunnan, China.</p></caption>
<graphic xlink:href="fevo-11-1193602-g002.tif"/>
</fig>
</sec>
<sec id="sec5">
<label>2.3.</label>
<title>Virtual species</title>
<p>Virtual species were modeled with known true distributions to validate the similarity approach. We generated two virtual species using the function <italic>generateSpFromPCA</italic> in the package <italic>virtualspecies</italic> v.1.5.1 (<xref ref-type="bibr" rid="ref23">Leroy et al., 2016</xref>) in R v.4.2.0 (<xref ref-type="bibr" rid="ref32">R Core Team, 2022</xref>), which creates different principal component axes based on the given environmental variables and defines the species response to the principal component axes. We selected the first two axes (explaining 80.34% of the variance in the environmental variables), set the ecological niche breadth of the species to &#x201C;wide,&#x201D; and fixed the slope &#x03B1; to &#x2212;0.1. Two virtual species with prevalences of 0.4 and 0.7 were generated. Subsequently, we performed a binary (presence-absence) transformation of the distribution suitability of the virtual species using a &#x201C;logistic&#x201D; approach. The environmental variables used to generate and predict the distribution of virtual species were consistent with those used for <italic>A. adenophora</italic>. To investigate the effect of sampling bias and the size of the presence data on the model prediction accuracy, we performed 10-fold subsampling (30, 50, 100, and 300 presences) of the potential distribution with and without bias. The bias weights were the spatial kernel densities of plant records (GBIF, <ext-link xlink:href="https://doi.org/10.15468/dl.p3pwxa" ext-link-type="uri">https://doi.org/10.15468/dl.p3pwxa</ext-link>) from Yunnan Province.</p>
</sec>
<sec id="sec6">
<label>2.4.</label>
<title>Modeling and evaluation</title>
<p>We modeled species distributions using a commonly used presence-absence method, the generalized linear model (GLM), implemented in the R package <italic>flexsdm</italic> v.1.3.3 (<xref ref-type="bibr" rid="ref41">Velazco et al., 2022</xref>). A key advantage of the GLM is its flexibility in accommodating different response variables. A GLM can handle binary data (presence or absence) by using a binomial distribution with a logit link function that models the probability of occurrence based on environmental predictors. This is well suited to species distribution modeling, where the goal is to predict the presence or absence of a species in relation to environmental variables. Here, GLMs were developed using a 10-fold cross-validation approach, with the data randomly divided for each iteration into training and test sets (70% and 30% of the data, respectively).</p>
<p>Several evaluation metrics were employed to assess the performance of the model. Sensitivity, representing the proportion of correctly predicted presences, and specificity, indicating the proportion of correctly predicted absences, were calculated. The true skill statistic (TSS) was computed as TSS&#x2009;=&#x2009;sensitivity + specificity &#x2212; 1 (<xref ref-type="bibr" rid="ref1">Allouche et al., 2006</xref>; <xref ref-type="bibr" rid="ref22">Jim&#x00E9;nez-Valverde and Lobo, 2007</xref>). The TSS is a threshold-dependent measure of model accuracy, and the threshold that maximized the TSS was used to classify prediction results as presence or absence. We also employed the area under the receiver operating characteristic (ROC) curve (AUC) as a summary measure of model performance (<xref ref-type="bibr" rid="ref1">Allouche et al., 2006</xref>). The AUC quantifies the overall fit of the model as the area under the curve obtained by plotting sensitivity against 1&#x2009;&#x2212;&#x2009;specificity over various thresholds. It ranges from 0.5 (representing a random model) to 1 (indicating a perfect fit). Schoener&#x2019;s <italic>D</italic> was used to evaluate niche overlap or similarity, ranging from 0 (completely dissimilar ecological niche) to 1 (identical ecological niche; <xref ref-type="bibr" rid="ref34">Schoener, 1968</xref>). Higher values of <italic>D</italic> indicate a better prediction performance. To ensure the reliability of our evaluations, these indicators were calculated based on the known potential distribution of virtual species (but note that for the real species <italic>A. adenophora</italic>, the AUC and TSS were calculated based on pseudo-absences). We utilized the R packages <italic>ENMTools</italic> v.2.0.0 (<xref ref-type="bibr" rid="ref42">Warren et al., 2021</xref>) and <italic>PresenceAbsence</italic> v.1.1.11 (<xref ref-type="bibr" rid="ref15">Freeman and Moisen, 2008</xref>) to compute <italic>D</italic> and TSS, respectively.</p>
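<p>As a hedged illustration of the threshold-dependent metrics described above, the following Python sketch computes the TSS at a given threshold, picks the candidate threshold with the highest TSS, and computes Schoener&#x2019;s <italic>D</italic> from two suitability vectors. It is not the code used in this study (which relied on the R packages named above). The function names, the toy inputs, and the convention that a prediction at or above the threshold counts as a presence are assumptions made for illustration, and the sketch assumes that both presences and absences occur in the observed data.</p>
<preformat preformat-type="code"><![CDATA[
```python
def confusion_counts(observed, predicted, threshold):
    """Count (tp, fp, tn, fn); a prediction >= threshold counts as presence."""
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted):
        pred = 1 if prob >= threshold else 0
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def tss(observed, predicted, threshold):
    """True skill statistic: sensitivity + specificity - 1."""
    tp, fp, tn, fn = confusion_counts(observed, predicted, threshold)
    sensitivity = tp / (tp + fn)  # proportion of presences predicted correctly
    specificity = tn / (tn + fp)  # proportion of absences predicted correctly
    return sensitivity + specificity - 1.0


def max_tss_threshold(observed, predicted, candidates):
    """Return the candidate threshold with the highest TSS (first one on ties)."""
    return max(candidates, key=lambda t: tss(observed, predicted, t))


def schoener_d(suit_a, suit_b):
    """Schoener's D between two suitability vectors, each rescaled to sum to 1."""
    total_a, total_b = sum(suit_a), sum(suit_b)
    diff = sum(abs(a / total_a - b / total_b) for a, b in zip(suit_a, suit_b))
    return 1.0 - 0.5 * diff
```
]]></preformat>
<p>In this sketch, identical suitability vectors give <italic>D</italic>&#x2009;=&#x2009;1 and vectors with no overlap give <italic>D</italic>&#x2009;=&#x2009;0, matching the range stated above.</p>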
</sec>
<sec id="sec7">
<label>2.5.</label>
<title>True absence and pseudo-absence</title>
<p>We considered all points outside the potential distribution of the species as true absences. The methods below were used to generate 10,000 pseudo-absences for the GLM (<xref ref-type="bibr" rid="ref2">Barbet-Massin et al., 2012</xref>).</p>
<sec id="sec8">
<label>2.5.1.</label>
<title>Similarity-based pseudo-absence method</title>
<p>Similarity-based pseudo-absence selection was performed in three steps (<xref rid="fig3" ref-type="fig">Figure 3</xref>). First, the geographic environmental similarity between every unknown location in the study area and the presence points was calculated. Second, the reliability of each location as a pseudo-absence was calculated from this similarity. Finally, the optimal reliability threshold was determined and pseudo-absences were obtained.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption><p>The three steps used to create the similarity-based pseudo-absence data. Step 1: calculate the similarity. Step 2: calculate the reliability of pseudo-absence. Step 3: determine the optimal reliability threshold <italic>R</italic>.</p></caption>
<graphic xlink:href="fevo-11-1193602-g003.tif"/>
</fig>
<p>Previous studies have used the Mahalanobis distance to measure similarity; however, this requires <italic>a priori</italic> prediction of the &#x201C;best observed value&#x201D; (<xref ref-type="bibr" rid="ref13">Farber and Kadmon, 2003</xref>). Here, kernel density estimation was used to calculate the geographic environmental similarity between each unknown location and all presences (<xref ref-type="bibr" rid="ref51">Zhu et al., 2015</xref>, <xref ref-type="bibr" rid="ref52">2018</xref>, <xref ref-type="bibr" rid="ref53">2019</xref>; <xref ref-type="bibr" rid="ref47">Xu et al., 2021</xref>, <xref ref-type="bibr" rid="ref46">2023a</xref>). The environmental variables must first be normalized to unify their magnitudes.</p>
<p>First, using <xref ref-type="disp-formula" rid="EQ1">Equation (1)</xref>, we calculated the similarity <italic>S</italic><italic><sub>i</sub></italic><italic><sup>v</sup></italic> between each unknown location <italic>i</italic> (<italic>i</italic>&#x2009;=&#x2009;1, 2, 3, &#x2026;, <italic>k</italic>; <italic>k</italic> is the total number of all locations) and all presences <italic>j</italic> (<italic>j</italic>&#x2009;=&#x2009;1, 2, 3, &#x2026;, <italic>n</italic>; <italic>n</italic> is the number of presences) based on the <italic>v</italic><sub>th</sub> (<italic>v</italic>&#x2009;=&#x2009;1, 2, 3, &#x2026;, <italic>l</italic>; <italic>l</italic> is the total number of environmental variables) environmental variable, where <italic>e</italic><italic><sub>i</sub></italic><italic><sup>v</sup></italic> and <italic>e</italic><italic><sub>j</sub></italic><italic><sup>v</sup></italic> are the values of the <italic>v</italic><sub>th</sub> environmental variable at unknown location <italic>i</italic> and presence <italic>j</italic>, respectively. The bandwidth <italic>h</italic> was determined using an empirical rule (<xref ref-type="bibr" rid="ref001">Liu et al., 2021</xref>) with <xref ref-type="disp-formula" rid="EQ2">Equation (2)</xref>, where <italic>&#x03C3;<sub>v</sub></italic> is the standard deviation of the <italic>v</italic><sub>th</sub> environmental variable. Subsequently, we combined all <italic>l</italic> environmental variables to compute the comprehensive similarity <italic>S<sub>i</sub></italic> of each unknown location to the presence data using <xref ref-type="disp-formula" rid="EQ3">Equation (3)</xref>, where <italic>f</italic> denotes the integrated similarity calculation function; here, we used the average.</p>
<disp-formula id="EQ1"><label>(1)</label><mml:math id="M1"><mml:mrow><mml:msubsup><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>v</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mi>n</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover></mml:mstyle><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msqrt><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x03C0;</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mi>exp</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>v</mml:mi></mml:msubsup><mml:mo>&#x2212;</mml:mo><mml:msubsup><mml:mi>e</mml:mi><mml:mi>j</mml:mi><mml:mi>v</mml:mi></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msup><mml:mi>h</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="EQ2"><label>(2)</label><mml:math id="M2"><mml:mrow><mml:mi>h</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x03C3;</mml:mi><mml:mi>v</mml:mi></mml:msub><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mn>4</mml:mn><mml:mrow><mml:mn>3</mml:mn><mml:mi>n</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>0.2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></disp-formula>
<disp-formula id="EQ3"><label>(3)</label><mml:math id="M3"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x2026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mi>S</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
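<p>Equations (1)&#x2013;(3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names and the data layout (each location as a tuple of normalized variable values) are hypothetical, and the standard deviation of each variable is passed in as an argument because the sketch makes no assumption about the sample over which it is computed. The sketch returns raw Gaussian kernel densities; any rescaling of the result is not shown.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def bandwidth(sigma, n):
    """Equation (2): h = sigma * (4 / (3 n)) ** 0.2."""
    return sigma * (4.0 / (3.0 * n)) ** 0.2


def variable_similarity(e_i, presence_values, h):
    """Equation (1): Gaussian kernel density of e_i given the presence values."""
    n = len(presence_values)
    kernel_sum = sum(
        math.exp(-((e_i - e_j) ** 2) / (2.0 * h ** 2)) / math.sqrt(2.0 * math.pi)
        for e_j in presence_values
    )
    return kernel_sum / (n * h)


def overall_similarity(location, presences, sigmas):
    """Equation (3) with f taken as the arithmetic mean over all variables.

    location  -- tuple of normalized variable values at the unknown location
    presences -- list of such tuples, one per presence point
    sigmas    -- one standard deviation per environmental variable
    """
    n = len(presences)
    per_variable = []
    for v, e_i in enumerate(location):
        values = [p[v] for p in presences]
        h = bandwidth(sigmas[v], n)
        per_variable.append(variable_similarity(e_i, values, h))
    return sum(per_variable) / len(per_variable)
```
]]></preformat>
<p>With this sketch, a location whose environmental values lie close to those of the presences receives a higher similarity than a location far from them in environmental space.</p>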
<p>The reliability of a location as a pseudo-absence is the complement of its similarity to the presence data; it was therefore derived from the results of the similarity calculations, as shown in <xref ref-type="disp-formula" rid="EQ4">Equation (4)</xref>:</p>
<disp-formula id="EQ4"><label>(4)</label> <mml:math id="M4"><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></disp-formula>
<p>where <italic>R<sub>i</sub></italic> is the reliability of unknown location <italic>i</italic> as a pseudo-absence. Similar to <italic>S<sub>i</sub></italic>, the value domain of <italic>R<sub>i</sub></italic> is [0, 1]. To test the prediction of the distribution of invasive species step by step under different threshold constraints, we tested a series of reliability intervals [<italic>t</italic>&#x2009;&#x00D7;&#x2009;<italic>s</italic>, 1] (<italic>t</italic>&#x2009;=&#x2009;0, 1, &#x2026;, <italic>k</italic>&#x2009;&#x2212;&#x2009;1, <italic>k</italic>; <italic>k</italic>&#x2009;=&#x2009;1/<italic>s</italic>), where <italic>s</italic> is the step size (set to 0.05) and <italic>R</italic>&#x2009;=&#x2009;<italic>t</italic>&#x2009;&#x00D7;&#x2009;<italic>s</italic> is the reliability threshold. Note that <italic>k</italic> here denotes the number of steps, not the total number of locations defined above.</p>
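<p>The reliability calculation and the stepwise thresholds can be sketched as follows. This is an illustrative Python sketch rather than the code used in the study: the function names are hypothetical, the similarities are assumed to lie in [0, 1] already, and the rounding of the thresholds is added only to avoid floating-point noise.</p>
<preformat preformat-type="code"><![CDATA[
```python
def reliability(similarity):
    """Equation (4): R_i = 1 - S_i, assuming S_i already lies in [0, 1]."""
    return 1.0 - similarity


def reliability_thresholds(step=0.05):
    """Thresholds R = t * s for t = 0, 1, ..., 1/s (step s, default 0.05)."""
    k = round(1.0 / step)
    # Rounding only removes floating-point noise such as 0.15000000000000002.
    return [round(t * step, 10) for t in range(k + 1)]


def candidate_pseudo_absences(similarities, threshold):
    """Indices of locations whose reliability falls in [threshold, 1]."""
    return [
        i for i, s in enumerate(similarities)
        if reliability(s) >= threshold
    ]
```
]]></preformat>
<p>For each threshold, pseudo-absences would then be sampled from the candidate locations returned by the last function, and the resulting models compared across thresholds.</p>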
<p>Model discrimination (AUC or TSS) can be high even when the model overpredicts, that is, when the area predicted as occupied by the species is large relative to the total study area. High discrimination therefore does not guarantee that the predictions are applicable or accurate, because a good model should capture most occurrences of the species within the smallest possible area (&#x201C;parsimony rule&#x201D;; <xref ref-type="bibr" rid="ref12">Engler et al., 2004</xref>; <xref ref-type="bibr" rid="ref26">Lobo et al., 2008</xref>; <xref ref-type="bibr" rid="ref16">Garc&#x00ED;a-Rosell&#x00F3; et al., 2019</xref>). Following this idea, we used the predicted efficiency index (PEI) to determine the optimal reliability threshold <italic>R</italic>. Like the AUC, the PEI is the area under a curve; here, the curve is defined by the coordinates <italic>x<sub>i</sub></italic> and <italic>y<sub>i</sub></italic> given in <xref ref-type="disp-formula" rid="EQ5">Equation (5)</xref>:</p>
<disp-formula id="EQ5"><label>(5)</label><mml:math id="M5"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:munderover></mml:mstyle><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac><mml:mo>,</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:munderover></mml:mstyle><mml:msub><mml:mi>N</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>Analogous to the reliability classification, but with a smaller step of 0.01, the model-predicted values were divided into <italic>m</italic> classes (<italic>i&#x2009;=</italic> 1, 2, &#x2026;, <italic>m</italic>). The vertical axis <italic>y<sub>i</sub></italic> represents the ratio of the number of known presences <italic>N<sub>i</sub></italic> within class <italic>i</italic> to the total number of presences, and the horizontal axis <italic>x<sub>i</sub></italic> represents the ratio of the area <italic>A<sub>i</sub></italic> predicted as presence within class <italic>i</italic> to the total study area (measured as the number of raster cells). The area under the curve was taken as the PEI, which lies in the interval [0, 1]; higher values indicate higher applicability or accuracy of the prediction results. Because both a high PEI and high model accuracy (sensitivity) are desirable, we took as the best pseudo-absence reliability threshold the value of <italic>R</italic> at which the mean of the PEI and sensitivity was greatest.</p>
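<p>One possible reading of the PEI calculation is sketched below in Python. It is an illustration under stated assumptions rather than the code used in this study: predicted values are assumed to lie in [0, 1], classes are assumed to be accumulated from the highest predicted value downward so that the curve runs from (0, 0) to (1, 1), and all function and argument names are hypothetical.</p>
<preformat><![CDATA[
```python
import numpy as np


def predicted_efficiency_index(pred_all, pred_presence, step=0.01):
    """Area under the curve of (area fraction, presence fraction).

    ASSUMPTION: classes are accumulated from the highest predicted
    value downward, so the curve runs from (0, 0) to (1, 1).
    """
    pred_all = np.asarray(pred_all, dtype=float)
    pred_presence = np.asarray(pred_presence, dtype=float)
    n_classes = int(round(1.0 / step))
    edges = np.linspace(0.0, 1.0, n_classes + 1)
    # A_i: raster cells per class; N_i: known presences per class.
    area_counts, _ = np.histogram(pred_all, bins=edges)
    pres_counts, _ = np.histogram(pred_presence, bins=edges)
    # Cumulative fractions, starting from the highest class.
    x = np.concatenate(([0.0], np.cumsum(area_counts[::-1]) / area_counts.sum()))
    y = np.concatenate(([0.0], np.cumsum(pres_counts[::-1]) / pres_counts.sum()))
    # Trapezoidal rule written out explicitly.
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


def best_threshold(thresholds, pei_values, sensitivities):
    """Threshold at which the mean of PEI and sensitivity is greatest."""
    score = (np.asarray(pei_values) + np.asarray(sensitivities)) / 2.0
    return thresholds[int(np.argmax(score))]
```
]]></preformat>
<p>In this sketch, a model whose presences all fall in the highest-valued class obtains a PEI close to 1, whereas a model whose presences are spread like the background obtains a PEI of about 0.5.</p>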
</sec>
<sec id="sec9">
<label>2.5.2.</label>
<title>Traditional pseudo-absence sampling methods</title>
<p>Three traditional types of pseudo-absence sampling were performed using the R package &#x201C;<italic>flexsdm</italic>&#x201D; (<xref ref-type="bibr" rid="ref41">Velazco et al., 2022</xref>): (1) the random method, in which points are selected at random from the whole geospatial background of the study area, excluding known presence points; (2) the geographical constraint method, in which points are sampled from areas located 20&#x2009;km away from known presence points; and (3) the environmental constraint method, in which points are sampled from the results generated by the envelope model BIOCLIM (<xref ref-type="bibr" rid="ref4">Booth et al., 2014</xref>).</p>
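<p>The logic of the first two methods can be sketched in Python as follows. This is not the <italic>flexsdm</italic> implementation: it assumes planar coordinates in kilometers, interprets the geographical constraint as a minimum distance to the nearest presence point, and uses hypothetical function names.</p>
<preformat><![CDATA[
```python
import numpy as np


def random_pseudo_absences(candidates, n, rng):
    """Draw n candidate points at random, without replacement."""
    candidates = np.asarray(candidates, dtype=float)
    idx = rng.choice(len(candidates), size=n, replace=False)
    return candidates[idx]


def geo_constrained_pseudo_absences(candidates, presences, n, min_dist_km, rng):
    """Draw n points lying at least min_dist_km from every presence.

    ASSUMPTION: coordinates are planar and expressed in kilometers.
    """
    candidates = np.asarray(candidates, dtype=float)
    presences = np.asarray(presences, dtype=float)
    # Distance from each candidate to its nearest presence point.
    diff = candidates[:, None, :] - presences[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    eligible = candidates[nearest >= min_dist_km]
    return random_pseudo_absences(eligible, n, rng)
```
]]></preformat>
<p>Candidate points are assumed to exclude known presence locations before either function is called.</p>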
</sec>
</sec>
</sec>
<sec id="sec10" sec-type="results">
<label>3.</label>
<title>Results</title>
<sec id="sec11">
<label>3.1.</label>
<title><italic>Ageratina adenophora</italic> distribution model</title>
<p>The <italic>A. adenophora</italic> distribution model responded differently to pseudo-absences of different reliability levels (<xref rid="fig4" ref-type="fig">Figure 4</xref>). As the reliability threshold <italic>R</italic> increased, the extent of highly suitable areas expanded, as shown by the increase in green areas. Relatively low thresholds (<xref rid="fig4" ref-type="fig">Figure 4A</xref>) were generally associated with low suitability. At a moderate threshold (<xref rid="fig4" ref-type="fig">Figure 4B</xref>), a relatively &#x201C;smooth&#x201D; gradient between high and low values was observed. However, as the threshold continued to increase (<xref rid="fig4" ref-type="fig">Figure 4C</xref>), the suitability exhibited a &#x201C;bipolar&#x201D; distribution pattern (values of either 0 or the maximum predicted value). In terms of model discrimination (<xref rid="fig5" ref-type="fig">Figure 5A</xref>), the AUC and TSS increased with the threshold. The PEI remained high and relatively flat at low thresholds (<italic>R</italic>&#x2009;=&#x2009;0.0&#x2013;0.2) and then decreased as the reliability threshold increased. Notably, the TSS (and sensitivity) became markedly unstable at high thresholds (<italic>R</italic>&#x2009;=&#x2009;0.9). The mean of the PEI and sensitivity was greatest at <italic>R</italic>&#x2009;=&#x2009;0.6; therefore, this value was set as the optimal reliability threshold.</p>
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption><p>The <italic>A. adenophora</italic> distribution model responded differently to pseudo-absences of different reliability levels. <bold>(A)</bold> At a low threshold, the pseudo-absence data may contain a large number of false absences; prediction efficiency is high, but the species distribution is underestimated. <bold>(B)</bold> At a medium threshold, some false absences are excluded, and over- or underestimation is mitigated. <bold>(C)</bold> At a high threshold, pseudo-absences are restricted to a narrow range; prediction efficiency is low, and the species distribution is overestimated.</p></caption>
<graphic xlink:href="fevo-11-1193602-g004.tif"/>
</fig>
<fig position="float" id="fig5">
<label>Figure 5</label>
<caption><p>The choice of pseudo-absence method influenced the accuracy of the <italic>A. adenophora</italic> distribution model. <bold>(A)</bold> Model accuracy (discrimination, calculated based on pseudo-absences and 300 presences) as a function of the reliability threshold in the <italic>A. adenophora</italic> distribution model; <bold>(B)</bold> Comparison of the accuracy (TSS) of different pseudo-absence selection methods in the <italic>A. adenophora</italic> distribution model; <bold>(C)</bold> Comparison of the prediction efficiency (PEI) of different pseudo-absence selection methods in the <italic>A. adenophora</italic> distribution model. Differences in means were tested using the <italic>t</italic>-test and ANOVA. The symbols &#x201C;&#x002A;&#x201D;, &#x201C;&#x002A;&#x002A;&#x201D;, &#x201C;&#x002A;&#x002A;&#x002A;&#x201D;, and &#x201C;&#x002A;&#x002A;&#x002A;&#x002A;&#x201D; denote <italic>p</italic>-values less than 0.05, 0.01, 0.001, and 0.0001, respectively.</p></caption>
<graphic xlink:href="fevo-11-1193602-g005.tif"/>
</fig>
<p>The choice of pseudo-absence method influenced the accuracy of the <italic>A. adenophora</italic> distribution model (<xref rid="fig5" ref-type="fig">Figures 5B</xref>,<xref rid="fig5" ref-type="fig">C</xref>). The environmental constraint and similarity methods substantially improved model accuracy (TSS) compared with the random and geographical constraint methods, although both reduced the PEI. The similarity method achieved a higher PEI than the environmental constraint method. Overall, the similarity method produced models that combined high prediction accuracy with high prediction efficiency, and it was more stable in balancing accurate prediction against overestimation.</p>
</sec>
<sec id="sec12">
<label>3.2.</label>
<title>Virtual species distribution simulation</title>
<p>For the virtual SDMs (<xref rid="fig6" ref-type="fig">Figure 6</xref>), pseudo-absence selection based on similarity achieved the best accuracy (TSS) and ecological realism (D) across different sample sizes (<xref rid="fig6" ref-type="fig">Figure 6A</xref>). The similarity method also achieved the highest accuracy and the most realistic representation of ecological niches for species with different distributions and prevalence rates (<xref rid="fig6" ref-type="fig">Figure 6B</xref>). For species with a higher prevalence, the environmental constraint method achieved slightly higher accuracy (TSS) but lower ecological realism (D) than the similarity method. Notably, although the ecological realism (D) attained by the similarity method was slightly lower than that of the random method under unbiased conditions, sampling bias did not significantly affect the high accuracy (TSS) achieved by the similarity method (<xref rid="fig6" ref-type="fig">Figure 6C</xref>). Overall, compared with traditional approaches, pseudo-absences selected by the similarity method yielded better model performance under various conditions.</p>
<fig position="float" id="fig6">
<label>Figure 6</label>
<caption><p>Performance (calculated based on known potential distributions) of different pseudo-absence methods in virtual species distribution models. Model accuracy (TSS) and ecological realism (D) for different pseudo-absence methods with different sample sizes <bold>(A)</bold>, prevalences <bold>(B)</bold>, and biases <bold>(C)</bold>. Differences in means were tested using the <italic>t</italic>-test and ANOVA. The symbols &#x201C;&#x002A;&#x201D;, &#x201C;&#x002A;&#x002A;&#x201D;, &#x201C;&#x002A;&#x002A;&#x002A;&#x201D;, and &#x201C;&#x002A;&#x002A;&#x002A;&#x002A;&#x201D; denote <italic>p</italic>-values less than 0.05, 0.01, 0.001, and 0.0001, respectively.</p></caption>
<graphic xlink:href="fevo-11-1193602-g006.tif"/>
</fig>
</sec>
</sec>
<sec id="sec13" sec-type="discussions">
<label>4.</label>
<title>Discussion</title>
<sec id="sec14">
<label>4.1.</label>
<title>The significance of quantifying pseudo-absence reliability and its implications in the distribution modeling of species</title>
<p>The results of the <italic>A. adenophora</italic> distribution model highlight the significant influence of the reliability threshold for pseudo-absence data on model quality. Increasing the reliability threshold improved both model discrimination (AUC) and prediction accuracy (TSS or sensitivity). However, it is important to exercise caution, as excessively high thresholds can lead to overprediction and confusion, reminiscent of the &#x201C;<italic>no elephants in Antarctica</italic>&#x201D; scenario (<xref rid="fig4" ref-type="fig">Figure 4C</xref>).</p>
<p>When the reliability threshold for the pseudo-absence data was relatively low, the sampled pseudo-absences covered a wide range of geographic environments but included many false absences, which lowered model accuracy, particularly sensitivity. The model predictions resembled those of the random method and were restricted to a narrow range around known presences. Consequently, potential distribution areas were identified poorly, despite the relatively high PEI values. Moderate reliability thresholds struck a balance by retaining geographic representativeness while filtering out false absences. This improved modeling accuracy and alleviated underestimation while maintaining high prediction efficiency. High reliability thresholds restricted pseudo-absence sampling to areas that were highly unfavorable for species survival. The model could then readily distinguish between presence and absence, yielding high accuracy. However, such data represent only a limited part of the geographic environment of non-invaded areas (places extremely unfavorable for the survival of the species) and fail to capture the full environmental complexity of those areas. As a result, more of the study area was classified as potential distribution area for the invasive species, resulting in a lower PEI. It is therefore crucial to determine an appropriate reliability threshold to achieve optimal prediction efficiency and accuracy.</p>
<p>The above process demonstrates that quantifying the reliability of pseudo-absence data using similarity represents a data-driven guideline for sampling reasonable pseudo-absence data. This approach provides valuable insight for future studies aimed at collecting reliable pseudo-absence data. To identify the optimal sampling threshold, we focused on the principle of accurately predicting species presence within smaller areas where possible. Our goal was to maximize both the sensitivity and PEI; therefore, we employed the concept of environmental similarity to species presence, interpreted as a measure of the uncertainty associated with pseudo-absences (<xref ref-type="bibr" rid="ref7">Buisson et al., 2010</xref>). This concept guided the selection of pseudo-absences, ensuring the reliability of the sampling approach.</p>
</sec>
<sec id="sec15">
<label>4.2.</label>
<title>The effectiveness of pseudo-absence data obtained via the geographic similarity method</title>
<p>Validation with the virtual SDMs demonstrated the effectiveness of similarity-based pseudo-absences, which yielded significant improvements. The random method underestimates the distribution of species, especially with small sample sizes, resulting in low sensitivity. The environmental constraint approach sacrifices specificity to enhance sensitivity, because it must assign more true absence sites to the predicted presence area in order to predict presence sites correctly (low PEI). The geographic constraint approach lies between the two, behaving like the random method at short distances and like the environmental constraint method at long distances.</p>
<p>In contrast to these approaches, the PEI enables a better balance between sensitivity and specificity within the similarity method. Similar to the trend observed for specificity, the PEI decreased as reliability increased. Because true absences are lacking, and true specificity cannot be calculated, maximizing the PEI and sensitivity can be viewed as a variation of maximizing specificity and sensitivity, namely, maximizing TSS, which has been shown to generate the most accurate distribution predictions. Thus, the similarity method achieved the highest precision in the virtual SDM.</p>
<p>Our findings confirm those of previous studies that obtained a reliable representation of the potential distribution of a species using pseudo-absences located near the external boundaries of the environmental niche occupied by that species (<xref ref-type="bibr" rid="ref8">Chefaoui and Lobo, 2008</xref>; <xref ref-type="bibr" rid="ref25">Lobo et al., 2010</xref>). Compared to traditional methods, the similarity method provides a more reasonable determination of this range. By considering the implications of quantifying pseudo-absence reliability and the effectiveness of the geographic similarity method, we gain valuable insights into improving the distribution modeling of species. These findings have important implications for conservation, invasive species management, and ecological research. Understanding the impact of pseudo-absence data reliability on model performance allows informed decisions when selecting appropriate thresholds and sampling methods.</p>
</sec>
<sec id="sec16">
<label>4.3.</label>
<title>Conclusion and further efforts</title>
<p>Our study introduced a novel method that utilizes geographic similarity to obtain representative pseudo-absence data for the distribution modeling of invasive species. By considering the relationship between species distribution and the geographic environment, we quantified the reliability of pseudo-absence data and predicted the distribution of the invasive plant <italic>A. adenophora</italic> in Yunnan Province, China. This approach was further validated using virtual species. Our analysis demonstrates that the similarity-based method enhances the representativeness of pseudo-absence data and improves predictive accuracy. This has important implications for conservation management, ensuring effective protection of rare species and management of invasive species. By quantifying the pseudo-absence reliability and incorporating geographic constraints, our approach improves the accuracy and reliability of SDMs, providing valuable information for conservation planning and biodiversity assessments. However, addressing the potential spatial bias in the sample data remains a challenge that requires further consideration and ongoing efforts to improve spatial representation. In conclusion, our research highlights the importance of quantifying pseudo-absence reliability and demonstrates the effectiveness of the geographic similarity method in the distribution modeling of species, offering insights for biodiversity conservation and management strategies. Future studies should validate and explore these approaches in different ecological contexts to advance our understanding of species-environment relationships and conservation efforts.</p>
</sec>
</sec>
<sec id="sec17" sec-type="data-availability">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref rid="sec21" ref-type="sec">Supplementary material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="sec18">
<title>Author contributions</title>
<p>XW, QX, and JL conceived and designed the experiments. XW performed all experiments, analyzed the data, and wrote the manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="sec19" sec-type="funding-information">
<title>Funding</title>
<p>Funding was provided by the National Natural Science Fund of China (Grant Nos. 42161065 and 41461038) and Graduate Research Innovation Fund of Yunnan Normal University.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec100" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack>
<p>The authors would like to express their sincere gratitude to the editors and reviewers who invested considerable time and effort into their comments on this paper. The authors have gained useful insights from and would like to express their sincere gratitude to Prof. A-Xing Zhu for his lecture &#x201C;Condensation of scientific problems and writing of SCI papers and grant projects&#x201D;. They would also like to thank Editage (<ext-link xlink:href="http://www.editage.cn" ext-link-type="uri">www.editage.cn</ext-link>) for English language editing.</p>
</ack>
<sec id="sec21" sec-type="supplementary-material">
<title>Supplementary material</title>
<p>The Supplementary material for this article can be found online at: <ext-link xlink:href="https://www.frontiersin.org/articles/10.3389/fevo.2023.1193602/full#supplementary-material" ext-link-type="uri">https://www.frontiersin.org/articles/10.3389/fevo.2023.1193602/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.ZIP" id="SM1" mimetype="application/zip" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allouche</surname> <given-names>O.</given-names></name> <name><surname>Tsoar</surname> <given-names>A.</given-names></name> <name><surname>Kadmon</surname> <given-names>R.</given-names></name></person-group> (<year>2006</year>). <article-title>Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS)</article-title>. <source>J. Appl. Ecol.</source> <volume>43</volume>, <fpage>1223</fpage>&#x2013;<lpage>1232</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1365-2664.2006.01214.x</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barbet-Massin</surname> <given-names>M.</given-names></name> <name><surname>Jiguet</surname> <given-names>F.</given-names></name> <name><surname>Albert</surname> <given-names>C. H.</given-names></name> <name><surname>Thuiller</surname> <given-names>W.</given-names></name></person-group> (<year>2012</year>). <article-title>Selecting pseudo-absences for species distribution models: how, where and how many?</article-title> <source>Methods Ecol. Evol.</source> <volume>3</volume>, <fpage>327</fpage>&#x2013;<lpage>338</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.2041-210X.2011.00172.x</pub-id></citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blackburn</surname> <given-names>T. M.</given-names></name> <name><surname>Bellard</surname> <given-names>C.</given-names></name> <name><surname>Ricciardi</surname> <given-names>A.</given-names></name></person-group> (<year>2019</year>). <article-title>Alien versus native species as drivers of recent extinctions</article-title>. <source>Front. Ecol. Environ.</source> <volume>17</volume>, <fpage>203</fpage>&#x2013;<lpage>207</lpage>. doi: <pub-id pub-id-type="doi">10.1002/fee.2020</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Booth</surname> <given-names>T. H.</given-names></name> <name><surname>Nix</surname> <given-names>H. A.</given-names></name> <name><surname>Busby</surname> <given-names>J. R.</given-names></name> <name><surname>Hutchinson</surname> <given-names>M. F.</given-names></name></person-group> (<year>2014</year>). <article-title><sc>bioclim</sc>: the first species distribution modelling package, its early applications and relevance to most current M<sc>ax</sc>E<sc>nt</sc> studies</article-title>. <source>Divers. Distrib.</source> <volume>20</volume>, <fpage>1</fpage>&#x2013;<lpage>9</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ddi.12144</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Broennimann</surname> <given-names>O.</given-names></name> <name><surname>Fitzpatrick</surname> <given-names>M. C.</given-names></name> <name><surname>Pearman</surname> <given-names>P. B.</given-names></name> <name><surname>Petitpierre</surname> <given-names>B.</given-names></name> <name><surname>Pellissier</surname> <given-names>L.</given-names></name> <name><surname>Yoccoz</surname> <given-names>N. G.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Measuring ecological niche overlap from occurrence and spatial environmental data</article-title>. <source>Glob. Ecol. Biogeogr.</source> <volume>21</volume>, <fpage>481</fpage>&#x2013;<lpage>497</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1466-8238.2011.00698.x</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brotons</surname> <given-names>L.</given-names></name> <name><surname>Thuiller</surname> <given-names>W.</given-names></name> <name><surname>Ara&#x00FA;jo</surname> <given-names>M. B.</given-names></name> <name><surname>Hirzel</surname> <given-names>A. H.</given-names></name></person-group> (<year>2004</year>). <article-title>Presence-absence versus presence-only modelling methods for predicting bird habitat suitability</article-title>. <source>Ecography</source> <volume>27</volume>, <fpage>437</fpage>&#x2013;<lpage>448</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.0906-7590.2004.03764.x</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Buisson</surname> <given-names>L.</given-names></name> <name><surname>Thuiller</surname> <given-names>W.</given-names></name> <name><surname>Casajus</surname> <given-names>N.</given-names></name> <name><surname>Lek</surname> <given-names>S.</given-names></name> <name><surname>Grenouillet</surname> <given-names>G.</given-names></name></person-group> (<year>2010</year>). <article-title>Uncertainty in ensemble forecasting of species distribution</article-title>. <source>Glob. Chang. Biol.</source> <volume>16</volume>, <fpage>1145</fpage>&#x2013;<lpage>1157</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1365-2486.2009.02000.x</pub-id></citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chefaoui</surname> <given-names>R. M.</given-names></name> <name><surname>Lobo</surname> <given-names>J. M.</given-names></name></person-group> (<year>2008</year>). <article-title>Assessing the effects of pseudo-absences on predictive distribution model performance</article-title>. <source>Ecol. Model.</source> <volume>210</volume>, <fpage>478</fpage>&#x2013;<lpage>486</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ecolmodel.2007.08.010</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elith</surname> <given-names>J.</given-names></name> <name><surname>Graham</surname> <given-names>C. H.</given-names></name> <name><surname>Anderson</surname> <given-names>R.</given-names></name> <name><surname>Dud&#x00ED;k</surname> <given-names>M.</given-names></name> <name><surname>Ferrier</surname> <given-names>S.</given-names></name> <name><surname>Guisan</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2006</year>). <article-title>Novel methods improve prediction of species&#x2019; distributions from occurrence data</article-title>. <source>Ecography</source> <volume>29</volume>, <fpage>129</fpage>&#x2013;<lpage>151</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.2006.0906-7590.04596.x</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elith</surname> <given-names>J.</given-names></name> <name><surname>Leathwick</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>Predicting species distributions from museum and herbarium records using multiresponse models fitted with multivariate adaptive regression splines</article-title>. <source>Divers. Distrib.</source> <volume>13</volume>, <fpage>265</fpage>&#x2013;<lpage>275</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1472-4642.2007.00340.x</pub-id></citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elith</surname> <given-names>J.</given-names></name> <name><surname>Leathwick</surname> <given-names>J. R.</given-names></name></person-group> (<year>2009</year>). <article-title>Species Distribution Models: Ecological Explanation and Prediction Across Space and Time</article-title>. <source>Annu. Rev. Ecol. Evol. Syst.</source> <volume>40</volume>, <fpage>677</fpage>&#x2013;<lpage>697</lpage>. doi: <pub-id pub-id-type="doi">10.1146/annurev.ecolsys.110308.120159</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Engler</surname> <given-names>R.</given-names></name> <name><surname>Guisan</surname> <given-names>A.</given-names></name> <name><surname>Rechsteiner</surname> <given-names>L.</given-names></name></person-group> (<year>2004</year>). <article-title>An improved approach for predicting the distribution of rare and endangered species from occurrence and pseudo-absence data</article-title>. <source>J. Appl. Ecol.</source> <volume>41</volume>, <fpage>263</fpage>&#x2013;<lpage>274</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.0021-8901.2004.00881.x</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Farber</surname> <given-names>O.</given-names></name> <name><surname>Kadmon</surname> <given-names>R.</given-names></name></person-group> (<year>2003</year>). <article-title>Assessment of alternative approaches for bioclimatic modeling with special emphasis on the Mahalanobis distance</article-title>. <source>Ecol. Model.</source> <volume>160</volume>, <fpage>115</fpage>&#x2013;<lpage>130</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0304-3800(02)00327-7</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fick</surname> <given-names>S. E.</given-names></name> <name><surname>Hijmans</surname> <given-names>R. J.</given-names></name></person-group> (<year>2017</year>). <article-title>WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas</article-title>. <source>Int. J. Climatol.</source> <volume>37</volume>, <fpage>4302</fpage>&#x2013;<lpage>4315</lpage>. doi: <pub-id pub-id-type="doi">10.1002/joc.5086</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Freeman</surname> <given-names>E. A.</given-names></name> <name><surname>Moisen</surname> <given-names>G.</given-names></name></person-group> (<year>2008</year>). <article-title>PresenceAbsence: An R Package for Presence Absence Analysis</article-title>. <source>J. Stat. Softw.</source> <volume>23</volume>, <fpage>1</fpage>&#x2013;<lpage>31</lpage>. doi: <pub-id pub-id-type="doi">10.18637/jss.v023.i11</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garc&#x00ED;a-Rosell&#x00F3;</surname> <given-names>E.</given-names></name> <name><surname>Guisande</surname> <given-names>C.</given-names></name> <name><surname>Gonz&#x00E1;lez-Vilas</surname> <given-names>L.</given-names></name> <name><surname>Gonz&#x00E1;lez-Dacosta</surname> <given-names>J.</given-names></name> <name><surname>Heine</surname> <given-names>J.</given-names></name> <name><surname>P&#x00E9;rez-Costas</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>A simple method to estimate the probable distribution of species</article-title>. <source>Ecography</source> <volume>42</volume>, <fpage>1613</fpage>&#x2013;<lpage>1622</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ecog.04563</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guisan</surname> <given-names>A.</given-names></name> <name><surname>Lehmann</surname> <given-names>A.</given-names></name> <name><surname>Ferrier</surname> <given-names>S.</given-names></name> <name><surname>Austin</surname> <given-names>M.</given-names></name> <name><surname>Overton</surname> <given-names>J. M.</given-names></name> <name><surname>Aspinall</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2006</year>). <article-title>Making better biogeographical predictions of species&#x2019; distributions</article-title>. <source>J. Appl. Ecol.</source> <volume>43</volume>, <fpage>386</fpage>&#x2013;<lpage>392</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1365-2664.2006.01164.x</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guisan</surname> <given-names>A.</given-names></name> <name><surname>Thuiller</surname> <given-names>W.</given-names></name></person-group> (<year>2005</year>). <article-title>Predicting species distribution: offering more than simple habitat models</article-title>. <source>Ecol. Lett.</source> <volume>8</volume>, <fpage>993</fpage>&#x2013;<lpage>1009</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1461-0248.2005.00792.x</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hirzel</surname> <given-names>A. H.</given-names></name> <name><surname>Helfer</surname> <given-names>V.</given-names></name> <name><surname>Metral</surname> <given-names>F.</given-names></name></person-group> (<year>2001</year>). <article-title>Assessing habitat-suitability models with a virtual species</article-title>. <source>Ecol. Model.</source> <volume>145</volume>, <fpage>111</fpage>&#x2013;<lpage>121</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0304-3800(01)00396-9</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hortal</surname> <given-names>J.</given-names></name> <name><surname>Lobo</surname> <given-names>J. M.</given-names></name> <name><surname>Jim&#x00E9;nez-Valverde</surname> <given-names>A.</given-names></name></person-group> (<year>2007</year>). <article-title>Limitations of Biodiversity Databases: Case Study on Seed-Plant Diversity in Tenerife, Canary Islands</article-title>. <source>Conserv. Biol.</source> <volume>21</volume>, <fpage>853</fpage>&#x2013;<lpage>863</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1523-1739.2007.00686.x</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Iturbide</surname> <given-names>M.</given-names></name> <name><surname>Bedia</surname> <given-names>J.</given-names></name> <name><surname>Herrera</surname> <given-names>S.</given-names></name> <name><surname>del Hierro</surname> <given-names>O.</given-names></name> <name><surname>Pinto</surname> <given-names>M.</given-names></name> <name><surname>Guti&#x00E9;rrez</surname> <given-names>J. M.</given-names></name></person-group> (<year>2015</year>). <article-title>A framework for species distribution modelling with improved pseudo-absence generation</article-title>. <source>Ecol. Model.</source> <volume>312</volume>, <fpage>166</fpage>&#x2013;<lpage>174</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ecolmodel.2015.05.018</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jim&#x00E9;nez-Valverde</surname> <given-names>A.</given-names></name> <name><surname>Lobo</surname> <given-names>J. M.</given-names></name></person-group> (<year>2007</year>). <article-title>Threshold criteria for conversion of probability of species presence to either&#x2013;or presence&#x2013;absence</article-title>. <source>Acta Oecol.</source> <volume>31</volume>, <fpage>361</fpage>&#x2013;<lpage>369</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.actao.2007.02.001</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leroy</surname> <given-names>B.</given-names></name> <name><surname>Meynard</surname> <given-names>C. N.</given-names></name> <name><surname>Bellard</surname> <given-names>C.</given-names></name> <name><surname>Courchamp</surname> <given-names>F.</given-names></name></person-group> (<year>2016</year>). <article-title>virtualspecies, an R package to generate virtual species distributions</article-title>. <source>Ecography</source> <volume>39</volume>, <fpage>599</fpage>&#x2013;<lpage>607</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ecog.01388</pub-id></citation></ref>
<ref id="ref001"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Q.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Jiang</surname> <given-names>R.</given-names></name> <name><surname>Wong</surname> <given-names>W. H.</given-names></name></person-group> (<year>2021</year>). <article-title>Density estimation using deep generative neural networks</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>118</volume>:<fpage>e2101344118</fpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.2101344118</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lobo</surname> <given-names>J. M.</given-names></name></person-group> (<year>2008</year>). <article-title>More complex distribution models or more representative data?</article-title> <source>Biodiv. Inf.</source> <volume>5</volume>, <fpage>14</fpage>&#x2013;<lpage>19</lpage>. doi: <pub-id pub-id-type="doi">10.17161/bi.v5i0.40</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lobo</surname> <given-names>J. M.</given-names></name> <name><surname>Jim&#x00E9;nez-Valverde</surname> <given-names>A.</given-names></name> <name><surname>Hortal</surname> <given-names>J.</given-names></name></person-group> (<year>2010</year>). <article-title>The uncertain nature of absences and their importance in species distribution modelling</article-title>. <source>Ecography</source> <volume>33</volume>, <fpage>103</fpage>&#x2013;<lpage>114</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1600-0587.2009.06039.x</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lobo</surname> <given-names>J. M.</given-names></name> <name><surname>Jim&#x00E9;nez-Valverde</surname> <given-names>A.</given-names></name> <name><surname>Real</surname> <given-names>R.</given-names></name></person-group> (<year>2008</year>). <article-title>AUC: a misleading measure of the performance of predictive distribution models</article-title>. <source>Glob. Ecol. Biogeogr.</source> <volume>17</volume>, <fpage>145</fpage>&#x2013;<lpage>151</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1466-8238.2007.00358.x</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lobo</surname> <given-names>J. M.</given-names></name> <name><surname>Tognelli</surname> <given-names>M. F.</given-names></name></person-group> (<year>2011</year>). <article-title>Exploring the effects of quantity and location of pseudo-absences and sampling biases on the performance of distribution models with limited point occurrence data</article-title>. <source>J. Nat. Conserv.</source> <volume>19</volume>, <fpage>1</fpage>&#x2013;<lpage>7</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jnc.2010.03.002</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meynard</surname> <given-names>C. N.</given-names></name> <name><surname>Leroy</surname> <given-names>B.</given-names></name> <name><surname>Kaplan</surname> <given-names>D. M.</given-names></name></person-group> (<year>2019</year>). <article-title>Testing methods in species distribution modelling using virtual species: what have we learnt and what are we missing?</article-title> <source>Ecography</source> <volume>42</volume>, <fpage>2021</fpage>&#x2013;<lpage>2036</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ecog.04385</pub-id></citation></ref>
<ref id="ref29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peterson</surname> <given-names>A. T.</given-names></name> <name><surname>Navarro-Sig&#x00FC;enza</surname> <given-names>A. G.</given-names></name> <name><surname>Gordillo</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>Assumption- versus data-based approaches to summarizing species&#x2019; ranges</article-title>. <source>Conserv. Biol.</source> <volume>32</volume>, <fpage>568</fpage>&#x2013;<lpage>575</lpage>. doi: <pub-id pub-id-type="doi">10.1111/cobi.12801</pub-id></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phillips</surname> <given-names>S. J.</given-names></name> <name><surname>Dud&#x00ED;k</surname> <given-names>M.</given-names></name> <name><surname>Elith</surname> <given-names>J.</given-names></name> <name><surname>Graham</surname> <given-names>C. H.</given-names></name> <name><surname>Lehmann</surname> <given-names>A.</given-names></name> <name><surname>Leathwick</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data</article-title>. <source>Ecol. Appl.</source> <volume>19</volume>, <fpage>181</fpage>&#x2013;<lpage>197</lpage>. doi: <pub-id pub-id-type="doi">10.1890/07-2153.1</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Py&#x0161;ek</surname> <given-names>P.</given-names></name> <name><surname>Jaro&#x0161;&#x00ED;k</surname> <given-names>V.</given-names></name> <name><surname>Hulme</surname> <given-names>P. E.</given-names></name> <name><surname>Pergl</surname> <given-names>J.</given-names></name> <name><surname>Hejda</surname> <given-names>M.</given-names></name> <name><surname>Schaffner</surname> <given-names>U.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>A global assessment of invasive plant impacts on resident species, communities and ecosystems: the interaction of impact measures, invading species&#x2019; traits and environment</article-title>. <source>Glob. Chang. Biol.</source> <volume>18</volume>, <fpage>1725</fpage>&#x2013;<lpage>1737</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1365-2486.2011.02636.x</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="book"><person-group person-group-type="author"><collab id="coll1">R Core Team</collab></person-group> (<year>2022</year>). <source>R: A Language and Environment for Statistical Computing</source>. <publisher-loc>Vienna, Austria</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>.</citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rocchini</surname> <given-names>D.</given-names></name> <name><surname>Hortal</surname> <given-names>J.</given-names></name> <name><surname>Lengyel</surname> <given-names>S.</given-names></name> <name><surname>Lobo</surname> <given-names>J. M.</given-names></name> <name><surname>Jim&#x00E9;nez-Valverde</surname> <given-names>A.</given-names></name> <name><surname>Ricotta</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Accounting for uncertainty when mapping species distributions: The need for maps of ignorance</article-title>. <source>Prog. Phys. Geogr.</source> <volume>35</volume>, <fpage>211</fpage>&#x2013;<lpage>226</lpage>. doi: <pub-id pub-id-type="doi">10.1177/0309133311399491</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schoener</surname> <given-names>T. W.</given-names></name></person-group> (<year>1968</year>). <article-title>The Anolis Lizards of Bimini: Resource Partitioning in a Complex Fauna</article-title>. <source>Ecology</source> <volume>49</volume>, <fpage>704</fpage>&#x2013;<lpage>726</lpage>. doi: <pub-id pub-id-type="doi">10.2307/1935534</pub-id></citation></ref>
<ref id="ref35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Senay</surname> <given-names>S. D.</given-names></name> <name><surname>Worner</surname> <given-names>S. P.</given-names></name> <name><surname>Ikeda</surname> <given-names>T.</given-names></name></person-group> (<year>2013</year>). <article-title>Novel Three-Step Pseudo-Absence Selection Technique for Improved Species Distribution Modelling</article-title>. <source>PLoS One</source> <volume>8</volume>:<fpage>e71218</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0071218</pub-id></citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname> <given-names>A. B.</given-names></name> <name><surname>Santos</surname> <given-names>M. J.</given-names></name> <name><surname>Koo</surname> <given-names>M. S.</given-names></name> <name><surname>Rowe</surname> <given-names>K. M. C.</given-names></name> <name><surname>Rowe</surname> <given-names>K. C.</given-names></name> <name><surname>Patton</surname> <given-names>J. L.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Evaluation of species distribution models by resampling of sites surveyed a century ago by Joseph Grinnell</article-title>. <source>Ecography</source> <volume>36</volume>, <fpage>1017</fpage>&#x2013;<lpage>1031</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1600-0587.2013.00107.x</pub-id></citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stockwell</surname> <given-names>D.</given-names></name></person-group> (<year>1999</year>). <article-title>The GARP modelling system: problems and solutions to automated spatial prediction</article-title>. <source>Int. J. Geogr. Inf. Sci.</source> <volume>13</volume>, <fpage>143</fpage>&#x2013;<lpage>158</lpage>. doi: <pub-id pub-id-type="doi">10.1080/136588199241391</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tessarolo</surname> <given-names>G.</given-names></name> <name><surname>Lobo</surname> <given-names>J. M.</given-names></name> <name><surname>Rangel</surname> <given-names>T. F.</given-names></name> <name><surname>Hortal</surname> <given-names>J.</given-names></name></person-group> (<year>2021</year>). <article-title>High uncertainty in the effects of data characteristics on the performance of species distribution models</article-title>. <source>Ecol. Indic.</source> <volume>121</volume>:<fpage>107147</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ecolind.2020.107147</pub-id></citation></ref>
<ref id="ref39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tocchio</surname> <given-names>L. J.</given-names></name> <name><surname>Gurgel-Gon&#x00E7;alves</surname> <given-names>R.</given-names></name> <name><surname>Escobar</surname> <given-names>L. E.</given-names></name> <name><surname>Peterson</surname> <given-names>A. T.</given-names></name></person-group> (<year>2015</year>). <article-title>Niche similarities among white-eared opossums (Mammalia, Didelphidae): Is ecological niche modelling relevant to setting species limits?</article-title> <source>Zool. Scr.</source> <volume>44</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>. doi: <pub-id pub-id-type="doi">10.1111/zsc.12082</pub-id></citation></ref>
<ref id="ref40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>VanDerWal</surname> <given-names>J.</given-names></name> <name><surname>Shoo</surname> <given-names>L. P.</given-names></name> <name><surname>Graham</surname> <given-names>C.</given-names></name> <name><surname>Williams</surname> <given-names>S. E.</given-names></name></person-group> (<year>2009</year>). <article-title>Selecting pseudo-absence data for presence-only distribution modeling: How far should you stray from what you know?</article-title> <source>Ecol. Model.</source> <volume>220</volume>, <fpage>589</fpage>&#x2013;<lpage>594</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ecolmodel.2008.11.010</pub-id></citation></ref>
<ref id="ref41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Velazco</surname> <given-names>S. J. E.</given-names></name> <name><surname>Rose</surname> <given-names>M. B.</given-names></name> <name><surname>de Andrade</surname> <given-names>A. F. A.</given-names></name> <name><surname>Minoli</surname> <given-names>I.</given-names></name> <name><surname>Franklin</surname> <given-names>J.</given-names></name></person-group> (<year>2022</year>). <article-title><sc>flexsdm</sc>: An <sc>r</sc> package for supporting a comprehensive and flexible species distribution modelling workflow</article-title>. <source>Methods Ecol. Evol.</source> <volume>13</volume>, <fpage>1661</fpage>&#x2013;<lpage>1669</lpage>. doi: <pub-id pub-id-type="doi">10.1111/2041-210X.13874</pub-id></citation></ref>
<ref id="ref42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Warren</surname> <given-names>D. L.</given-names></name> <name><surname>Matzke</surname> <given-names>N. J.</given-names></name> <name><surname>Cardillo</surname> <given-names>M.</given-names></name> <name><surname>Baumgartner</surname> <given-names>J. B.</given-names></name> <name><surname>Beaumont</surname> <given-names>L. J.</given-names></name> <name><surname>Turelli</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>ENMTools 1.0: an R package for comparative ecological biogeography</article-title>. <source>Ecography</source> <volume>44</volume>, <fpage>504</fpage>&#x2013;<lpage>511</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ecog.05485</pub-id></citation></ref>
<ref id="ref43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wisz</surname> <given-names>M. S.</given-names></name> <name><surname>Guisan</surname> <given-names>A.</given-names></name></person-group> (<year>2009</year>). <article-title>Do pseudo-absence selection strategies influence species distribution models and their predictions? An information-theoretic approach based on simulated data</article-title>. <source>BMC Ecol.</source> <volume>9</volume>:<fpage>8</fpage>. doi: <pub-id pub-id-type="doi">10.1186/1472-6785-9-8</pub-id></citation></ref>
<ref id="ref44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wisz</surname> <given-names>M. S.</given-names></name> <name><surname>Hijmans</surname> <given-names>R. J.</given-names></name> <name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Peterson</surname> <given-names>A. T.</given-names></name> <name><surname>Graham</surname> <given-names>C. H.</given-names></name> <name><surname>Guisan</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>Effects of sample size on the performance of species distribution models</article-title>. <source>Divers. Distrib.</source> <volume>14</volume>, <fpage>763</fpage>&#x2013;<lpage>773</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1472-4642.2008.00482.x</pub-id></citation></ref>
<ref id="ref45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xian</surname> <given-names>X.</given-names></name> <name><surname>Zhao</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>R.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>B.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>Evidence of the niche expansion of crofton weed following invasion in China</article-title>. <source>Ecol. Evol.</source> <volume>13</volume>:<fpage>e9708</fpage>. doi: <pub-id pub-id-type="doi">10.1002/ece3.9708</pub-id></citation></ref>
<ref id="ref46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Q.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name></person-group> (<year>2023a</year>). <article-title>A geographical similarity-based sampling method of non-fire point data for spatial prediction of forest fires</article-title>. <source>For. Ecosyst.</source> <volume>10</volume>:<fpage>100104</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.fecs.2023.100104</pub-id></citation></ref>
<ref id="ref47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Q.</given-names></name> <name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Liang</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>Simulation of Land-Use Changes Using the Partitioned ANN-CA Model and Considering the Influence of Land-Use Change Frequency</article-title>. <source>ISPRS Int. J. Geo-Inf.</source> <volume>10</volume>:<fpage>346</fpage>. doi: <pub-id pub-id-type="doi">10.3390/ijgi10050346</pub-id></citation></ref>
<ref id="ref48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Q.</given-names></name> <name><surname>Zhu</surname> <given-names>A.-X.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name></person-group> (<year>2023b</year>). <article-title>Land-use change modeling with cellular automata using land natural evolution unit</article-title>. <source>Catena</source> <volume>224</volume>:<fpage>106998</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.catena.2023.106998</pub-id></citation></ref>
<ref id="ref49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zaniewski</surname> <given-names>A. E.</given-names></name> <name><surname>Lehmann</surname> <given-names>A.</given-names></name> <name><surname>Overton</surname> <given-names>J. M.</given-names></name></person-group> (<year>2002</year>). <article-title>Predicting species spatial distributions using presence-only data: a case study of native New Zealand ferns</article-title>. <source>Ecol. Model.</source> <volume>157</volume>, <fpage>261</fpage>&#x2013;<lpage>280</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0304-3800(02)00199-0</pub-id></citation></ref>
<ref id="ref50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Ma</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Cao</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name></person-group> (<year>2007</year>). <article-title>Patterns of <italic>Eupatorium adenophorum</italic> along roadsides in Lincang region, Yunnan province, China</article-title>. <source>Ecol. Environ. Sci.</source> <volume>16</volume>, <fpage>516</fpage>&#x2013;<lpage>522</lpage>. doi: <pub-id pub-id-type="doi">10.16258/j.cnki.1674-5906.2007.02.050</pub-id></citation></ref>
<ref id="ref51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>A. X.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Du</surname> <given-names>F.</given-names></name> <name><surname>Zhang</surname> <given-names>S. J.</given-names></name> <name><surname>Qin</surname> <given-names>C. Z.</given-names></name> <name><surname>Burt</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Predictive soil mapping with limited sample data</article-title>. <source>Eur. J. Soil Sci.</source> <volume>66</volume>, <fpage>535</fpage>&#x2013;<lpage>547</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ejss.12244</pub-id></citation></ref>
<ref id="ref52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>A.</given-names></name> <name><surname>Lu</surname> <given-names>G.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Qin</surname> <given-names>C.</given-names></name> <name><surname>Zhou</surname> <given-names>C.</given-names></name></person-group> (<year>2018</year>). <article-title>Spatial prediction based on Third Law of Geography</article-title>. <source>Ann. GIS</source> <volume>24</volume>, <fpage>225</fpage>&#x2013;<lpage>240</lpage>. doi: <pub-id pub-id-type="doi">10.1080/19475683.2018.1534890</pub-id></citation></ref>
<ref id="ref53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>A.-X.</given-names></name> <name><surname>Miao</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Bai</surname> <given-names>S.</given-names></name> <name><surname>Zeng</surname> <given-names>C.</given-names></name> <name><surname>Ma</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>A similarity-based approach to sampling absence data for landslide susceptibility mapping using data-driven methods</article-title>. <source>Catena</source> <volume>183</volume>:<fpage>104188</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.catena.2019.104188</pub-id></citation></ref>
</ref-list>
</back>
</article>
