<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Energy Res.</journal-id>
<journal-title>Frontiers in Energy Research</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Energy Res.</abbrev-journal-title>
<issn pub-type="epub">2296-598X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">1427587</article-id>
<article-id pub-id-type="doi">10.3389/fenrg.2024.1427587</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Energy Research</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>MRGS-LSTM: a novel multi-site wind speed prediction approach with spatio-temporal correlation</article-title>
<alt-title alt-title-type="left-running-head">Zhou and Fan</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fenrg.2024.1427587">10.3389/fenrg.2024.1427587</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Zhou</surname>
<given-names>Yueguang</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/2729836/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/investigation/"/>
<role content-type="https://credit.niso.org/contributor-roles/data-curation/"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Fan</surname>
<given-names>Xiuxiang</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2822814/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/funding-acquisition/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
</contrib>
</contrib-group>
<aff>
<institution>School of Electrical and Electronic Engineering</institution>, <institution>Hubei University of Technology</institution>, <addr-line>Wuhan</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/533177/overview">Takvor H Soukissian</ext-link>, Hellenic Centre for Marine Research (HCMR), Greece</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1590007/overview">Kiran Bhaganagar</ext-link>, University of Texas at San Antonio, United States</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1942808/overview">Yan Jiang</ext-link>, Southwest University, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Xiuxiang Fan, <email>fanxxhbut302@163.com</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>29</day>
<month>08</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>12</volume>
<elocation-id>1427587</elocation-id>
<history>
<date date-type="received">
<day>04</day>
<month>05</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>08</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2024 Zhou and Fan.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Zhou and Fan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>The wind energy industry is entering a new era of rapid growth as the demand for renewable energy continues to rise. However, accurately predicting wind speed remains a significant challenge because of its high fluctuation and randomness, which hinders effective wind farm management and integration into the power grid. To address this issue, we propose the MRGS-LSTM model, which considers the complex spatio-temporal correlations between features at multiple sites to improve the accuracy and reliability of wind speed prediction. First, mRMR-RF filters the input multidimensional meteorological variables and computes the feature subset with minimum information redundancy. Second, the feature map topology is constructed by quantifying the spatial distance distribution of the multiple sites and the maximum mutual information coefficient among the features. On this basis, the GraphSAGE framework is used to sample and aggregate the feature information of neighboring sites to extract spatial feature vectors. Then, the spatial feature vectors are input into the long short-term memory (LSTM) model after sliding window sampling. The LSTM model learns the temporal features of the wind speed data and outputs spatio-temporally correlated predictions for each site. Finally, in simulation experiments based on real historical data from the Roscoe Wind Farm in Texas, United States, we show that MRGS-LSTM improves MAE by 15.43%&#x2013;27.97% and RMSE by 12.57%&#x2013;25.40% compared with other models of the same type. The experimental results verify the validity and superiority of the proposed model and provide a more reliable basis for the scheduling and optimization of wind farms.</p>
</abstract>
<kwd-group>
<kwd>multi-site wind speed prediction</kwd>
<kwd>deep learning</kwd>
<kwd>GraphSAGE</kwd>
<kwd>long short-term memory</kwd>
<kwd>spatio-temporal correlation</kwd>
</kwd-group>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Wind Energy</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>As a green, renewable and clean source of energy, wind energy is crucial for mitigating climate change and building a sustainable energy system. Wind power has become an important part of global renewable energy (<ext-link ext-link-type="uri" xlink:href="https://gwec.net/wp-content/uploads/2023/04/GWEC-2023_interactive.pdf">https://gwec.net/wp-content/uploads/2023/04/GWEC-2023_interactive.pdf</ext-link>, 2023). However, due to the high randomness and volatility of wind speed, large-scale grid-connected wind power can pose a serious threat to the smooth operation of power systems (<xref ref-type="bibr" rid="B30">Zhang et al., 2019</xref>; <xref ref-type="bibr" rid="B14">Li, 2022</xref>). Predicting wind speed can help wind farms adjust their scheduling plans in real time and provide a reference for scheduling the operation and maintenance of wind turbines (<xref ref-type="bibr" rid="B25">Wu et al., 2021</xref>). Therefore, improving the accuracy of wind speed prediction will reduce the cost of wind energy utilization and enhance the efficiency of wind power access (<xref ref-type="bibr" rid="B11">Khosravi et al., 2018</xref>; <xref ref-type="bibr" rid="B29">Zhang et al., 2020</xref>).</p>
<p>Generally, wind speed prediction methods are divided into physical, statistical and machine learning methods. Physical methods use numerical weather prediction models and probability density models to correct errors. Statistical methods mainly use autoregressive integrated moving average (ARIMA) models to fit historical data (<xref ref-type="bibr" rid="B2">Cadenas et al., 2016</xref>), but neither approach captures the dynamic changes and nonlinear features of wind speed well. Machine learning methods have been widely used for single-turbine wind speed prediction because of their powerful feature extraction capabilities. Time-series prediction models such as LSTM (<xref ref-type="bibr" rid="B17">Liu et al., 2018</xref>; <xref ref-type="bibr" rid="B15">Li et al., 2022</xref>; <xref ref-type="bibr" rid="B24">Wang et al., 2023</xref>) and the gated recurrent unit (GRU) (<xref ref-type="bibr" rid="B13">Li et al., 2020</xref>; <xref ref-type="bibr" rid="B23">Wang and Gui, 2022</xref>) have strong nonlinear fitting and learning abilities and are therefore popular in wind speed prediction modeling. For example, <xref ref-type="bibr" rid="B25">Wu et al. (2021)</xref> proposed an LSTM network model that combines the maximum information coefficient (MIC) and multi-task learning, and verified that the machine learning model outperforms physical and statistical methods. <xref ref-type="bibr" rid="B3">Chen et al. (2021)</xref> leveraged a bidirectional GRU to improve the accuracy and generalization ability of the model by extracting the temporal feature information of wind power and meteorological data. From a macroscale perspective, wind speed exhibits certain regularities and periodicities annually, quarterly, and even monthly.
In addition, studies (<xref ref-type="bibr" rid="B20">Nielson and Bhaganagar, 2019</xref>; <xref ref-type="bibr" rid="B21">Nielson et al., 2020</xref>) have shown that atmospheric stability plays a key role in wind energy production. Considering atmospheric input characteristics, such as wind shear and turbulence intensity, can significantly improve the accuracy of wind turbine power predictions. This highlights the importance of including atmospheric variables in wind speed prediction models to improve their performance and reliability.</p>
<p>Wind farms are mostly constructed in clusters, and single-turbine wind speed prediction methods are difficult to adapt to wind speed prediction at the wind farm scale. Because the sites within a wind farm share highly similar environmental and meteorological conditions, their wind speed variations are correlated, and this spatial correlation can be utilized to improve the accuracy of wind speed prediction. <xref ref-type="bibr" rid="B31">Zhu et al. (2021)</xref> used a convolutional neural network (CNN) to extract the field-wide features that affect the long-term wind speed distribution and considered the wind speed correlation among units. However, relying on a single feature limits the ability of the model to capture complex interactions when it is extended to multiple turbines. <xref ref-type="bibr" rid="B22">Wang et al. (2022)</xref> proposed an ultra-short-term wind farm cluster power forecasting method based on dynamic spatio-temporal correlation. <xref ref-type="bibr" rid="B27">Yu et al. (2022)</xref> established a CNN-LSTM-AM dynamic integration model. However, a CNN cannot accurately represent the spatial features of wind fields whose sites are not arranged on a regular grid, which lowers the accuracy of wind speed prediction. <xref ref-type="bibr" rid="B1">Bai et al. (2024)</xref> proposed a wind speed prediction model based on improved variational mode decomposition and a Seq2Seq network, which fully learns the implicit correlation features of multidimensional time series data. However, the Seq2Seq model has a complex encoder and decoder structure, which makes it consume substantial computational resources when the wind speed fluctuates greatly. On the whole, the above methods fail to fully utilize the spatial correlation of multiple variables among sites within a wind farm, and they portray the spatial relationships of these variables insufficiently.</p>
<p>In recent years, a growing body of research has used graph models to characterize the spatial correlation of multiple wind farm features. Graph models characterize the spatial relationships between nodes, and a graph neural network (GNN) transfers and aggregates features over the graph to better extract spatial features. Typical graph neural networks include the graph convolutional network (GCN) (<xref ref-type="bibr" rid="B19">Liu and Ware, 2022</xref>) and the graph attention network (GAT) (<xref ref-type="bibr" rid="B16">Liu et al., 2023a</xref>), which have been successfully applied to extract spatial features from graph models in various fields, such as traffic prediction (<xref ref-type="bibr" rid="B26">Yu et al., 2018</xref>) and airport delay prediction (<xref ref-type="bibr" rid="B28">Zeng et al., 2021</xref>). <xref ref-type="bibr" rid="B5">Geng et al. (2021)</xref> proposed a graph optimization neural network for multi-node offshore wind speed prediction, which captures spatial dependencies and generates high-dimensional spatial features through GCN and channel attention mechanisms. <xref ref-type="bibr" rid="B10">Khodayar and Wang (2019)</xref> combined rough set networks and GCN to extract spatial features of wind farms. <xref ref-type="bibr" rid="B18">Liu et al. (2023b)</xref> proposed an adaptive graph learning convolutional network (AGLCN) that can automatically infer hidden associations and achieve better results in extracting spatial features of offshore wind farms. <xref ref-type="bibr" rid="B7">He et al. (2022)</xref> used GAT to extract multi-site wind features for collaborative wind speed prediction to improve the accuracy of the model. Therefore, wind speed prediction for multiple sites requires comprehensive consideration of both the time-series data from each individual site and the interactions between sites across space.
Characterizing and utilizing these spatial and temporal correlations of multiple sites reasonably and efficiently is the key to improving the accuracy of wind speed prediction models. Many studies either focus on single-turbine wind speed prediction or fail to adequately capture the complex spatio-temporal interactions among sites. Moreover, traditional models often overlook the irregular distribution of wind turbines, resulting in poor generality in practical applications.</p>
<p>This study proposes a novel wind speed prediction model, called MRGS-LSTM, which integrates the spatio-temporal correlations of multiple irregularly distributed sites. The novelty and contributions of this study are described below.<list list-type="simple">
<list-item>
<p>1. We randomly select 20 irregularly distributed wind turbines to validate our model&#x2019;s robustness in handling datasets from irregular wind site layouts.</p>
</list-item>
<list-item>
<p>2. We develop a novel mRMR-RF method to assess the importance of various features and select the most influential subset relevant to wind speed prediction.</p>
</list-item>
<list-item>
<p>3. Our innovative MRGS-LSTM model extracts spatio-temporal features from multiple sites. Experiments show that central sites achieve better prediction accuracy by incorporating wind speed data from neighboring sites.</p>
</list-item>
<list-item>
<p>4. We select other mainstream graph neural network models and temporal feature extraction models to construct five comparison models. The experimental results show that our model performs excellently in terms of MAE and RMSE.</p>
</list-item>
</list>
</p>
</sec>
<sec sec-type="methods" id="s2">
<title>2 Methodology</title>
<sec id="s2-1">
<title>2.1 Graph modeling</title>
<p>Consider a graph <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> represents the set of <inline-formula id="inf3">
<mml:math id="m3">
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> nodes, and <inline-formula id="inf4">
<mml:math id="m4">
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is the set of edges that show the node-to-node connectivity. Node <inline-formula id="inf5">
<mml:math id="m5">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf6">
<mml:math id="m6">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are connected in the graph only when there exists a strong correlation between them, where <inline-formula id="inf7">
<mml:math id="m7">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>V</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. The correlation between <inline-formula id="inf8">
<mml:math id="m8">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf9">
<mml:math id="m9">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is defined as <inline-formula id="inf10">
<mml:math id="m10">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. <inline-formula id="inf11">
<mml:math id="m11">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>N</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf12">
<mml:math id="m12">
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>i</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>i</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the set of <italic>n</italic> features of node <inline-formula id="inf13">
<mml:math id="m13">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, denotes the sample set of all nodes. <inline-formula id="inf14">
<mml:math id="m14">
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is an <inline-formula id="inf15">
<mml:math id="m15">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> adjacency matrix. To enhance the computational efficiency of the graph neural network and prevent interference from weakly correlated nodes, <inline-formula id="inf16">
<mml:math id="m16">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is defined as <xref ref-type="disp-formula" rid="e1">Equation 1</xref>.<disp-formula id="e1">
<mml:math id="m17">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">W</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="" separators="&#x7c;">
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn mathvariant="bold">0.8</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mn mathvariant="bold">0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold-italic">otherwise</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>where <inline-formula id="inf17">
<mml:math id="m18">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. <xref ref-type="fig" rid="F1">Figure 1</xref> shows an example of adjacency matrix construction, which describes the connection relationships and correlations between nodes.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Adjacency matrix construction.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g001.tif"/>
</fig>
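<p>As a minimal sketch of Equation 1, the thresholding step can be written in a few lines of NumPy, assuming the pairwise correlations are already available as a matrix. The function name <italic>build_adjacency</italic> and the correlation values below are hypothetical and serve only as an illustration; the diagonal is left as is, which is an assumption, since the text does not state how self-connections are treated.</p>
<preformat><![CDATA[
```python
import numpy as np

def build_adjacency(corr, threshold=0.8):
    """Keep correlations >= threshold as edge weights and zero out the rest (Equation 1)."""
    corr = np.asarray(corr, dtype=float)
    # Average with the transpose so that W_ij == W_ji even if the input is slightly asymmetric.
    corr = (corr + corr.T) / 2.0
    return np.where(corr >= threshold, corr, 0.0)

# Hypothetical correlation matrix for three sites (illustrative values only).
r = np.array([
    [1.00, 0.85, 0.40],
    [0.85, 1.00, 0.79],
    [0.40, 0.79, 1.00],
])
W = build_adjacency(r)
```
]]></preformat>
<p>In this toy case only the pair with correlation 0.85 remains connected; the pairs with correlations 0.40 and 0.79 fall below the 0.8 threshold and receive a weight of zero.</p>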
<sec id="s2-1-1">
<title>2.1.1 GraphSAGE</title>
<p>The GraphSAGE network is an inductive learning framework for graph representation. By leveraging the attribute information of the nodes, it can efficiently generate vector representations of unseen nodes or new graphs. Computational complexity is reduced by sampling a fixed number of neighboring nodes for aggregation instead of using the full neighborhood. GraphSAGE can also capture diverse graph structures and feature information, which allows it to fit various graph data and tasks.</p>
<p>For each target node, GraphSAGE randomly samples neighbor nodes over K layers. The preset number of neighbor nodes to be sampled in each layer is denoted as <inline-formula id="inf18">
<mml:math id="m19">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>K</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. During sampling, if the number of neighbor nodes available at the <inline-formula id="inf19">
<mml:math id="m20">
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>-th layer is less than <inline-formula id="inf20">
<mml:math id="m21">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, sampling with replacement is adopted so that the sample size is still reached; otherwise, sampling without replacement is used. Based on the study by <xref ref-type="bibr" rid="B6">Hamilton et al. (2018)</xref>, we select K &#x3d; 2 for GraphSAGE to achieve excellent performance. For example, if <inline-formula id="inf21">
<mml:math id="m22">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 3 and <inline-formula id="inf22">
<mml:math id="m23">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 7, the sampling and aggregation process is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. The blue dots and the green dots represent the first-order and second-order neighbor nodes sampled for the target node, respectively.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Illustration of GraphSAGE.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g002.tif"/>
</fig>
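<p>The fixed-size neighbor sampling can be sketched as follows, using only the Python standard library. The sketch follows the usual GraphSAGE convention of sampling with replacement only when a node has fewer neighbors than the requested sample size, and it interprets the sample size as a per-node quantity, which is an assumption. The function names, the toy adjacency list and the random seed are hypothetical.</p>
<preformat><![CDATA[
```python
import random

def sample_neighbors(neighbors, s_k, rng):
    """Return exactly s_k sampled neighbors of one node (empty if it has none)."""
    if not neighbors:
        return []
    if len(neighbors) < s_k:
        # Too few neighbors: draw with replacement so the sample still has s_k entries.
        return [rng.choice(neighbors) for _ in range(s_k)]
    # Enough neighbors: draw s_k distinct ones (without replacement).
    return rng.sample(neighbors, s_k)

def sample_two_hops(adj_list, target, s1, s2, rng):
    """Sample first-order neighbors of the target, then second-order neighbors of each."""
    first = sample_neighbors(adj_list[target], s1, rng)
    second = {u: sample_neighbors(adj_list[u], s2, rng) for u in set(first)}
    return first, second

# Hypothetical five-node graph given as an adjacency list.
rng = random.Random(0)
adj_list = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2], 4: [0]}
first, second = sample_two_hops(adj_list, target=0, s1=3, s2=7, rng=rng)
```
]]></preformat>
<p>Node 0 has four neighbors, so its three first-order samples are distinct. Every first-order neighbor in this toy graph has fewer than seven neighbors, so the second-order samples contain repeated nodes.</p>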
<p>GraphSAGE aggregates the information of neighboring nodes at each layer through the aggregation function AGGREGATE. The embedding of the target node is then updated layer by layer according to <xref ref-type="disp-formula" rid="e2">Equations 2</xref> and <xref ref-type="disp-formula" rid="e3">3</xref>.<disp-formula id="e2">
<mml:math id="m24">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">N</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msubsup>
<mml:mo>&#x2190;</mml:mo>
<mml:msup>
<mml:mi mathvariant="bold-italic">AGGREGATE</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2200;</mml:mo>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="bold-italic">N</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
</p>
<p>
<disp-formula id="e3">
<mml:math id="m25">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msubsup>
<mml:mo>&#x2190;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">Z</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msup>
<mml:mo>&#x00B7;</mml:mo>
<mml:mi mathvariant="bold-italic">CONCAT</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">N</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>where <inline-formula id="inf23">
<mml:math id="m26">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> denotes the set of neighbor nodes of node <italic>V</italic>, <inline-formula id="inf24">
<mml:math id="m27">
<mml:mrow>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mi>V</mml:mi>
<mml:mi>k</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> denotes the embedding vector of node <inline-formula id="inf25">
<mml:math id="m28">
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> in the <inline-formula id="inf26">
<mml:math id="m29">
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>-th layer, the <italic>CONCAT</italic> function concatenates the previous embedding of the node with the aggregated neighborhood vector, <inline-formula id="inf27">
<mml:math id="m30">
<mml:mrow>
<mml:msup>
<mml:mi>Z</mml:mi>
<mml:mi>k</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the weight matrix that GraphSAGE needs to learn, and <inline-formula id="inf28">
<mml:math id="m31">
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is the nonlinear activation function ReLU, which mitigates the vanishing gradient problem and speeds up the convergence of the network. To further aid convergence, the vector representation is normalized by <xref ref-type="disp-formula" rid="e4">Equation 4</xref>.<disp-formula id="e4">
<mml:math id="m32">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msubsup>
<mml:mo>&#x2190;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="&#x2016;" close="&#x2016;" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
</p>
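<p>The update and normalization described above can be sketched for a single node as follows. This is a minimal illustration, not the authors' implementation: it assumes a mean aggregator over the neighbor embeddings and assumes that the normalization in Equation 4 divides by the Euclidean norm. The function names <italic>graphsage_update</italic>, <italic>matvec</italic> and <italic>l2_normalize</italic> and the toy weight matrix are hypothetical.</p>
<preformat>
```python
import math

def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, a) for a in v]

def matvec(Z, v):
    """Multiply matrix Z (a list of rows) by vector v."""
    return [sum(z * a for z, a in zip(row, v)) for row in Z]

def l2_normalize(h, eps=1e-12):
    """Scale h to unit Euclidean length; eps guards against division by zero."""
    norm = math.sqrt(sum(a * a for a in h))
    return [a / max(norm, eps) for a in h]

def graphsage_update(h_self, h_neighbors, Z):
    """One update for one node: aggregate, concatenate, project, activate, normalize."""
    dim = len(h_self)
    count = len(h_neighbors)
    # Assumption: mean aggregator over the neighbor embeddings.
    h_agg = [sum(h[d] for h in h_neighbors) / count for d in range(dim)]
    concat = h_self + h_agg  # CONCAT of the node's own and the aggregated embedding
    return l2_normalize(relu(matvec(Z, concat)))

# Toy example: 2-dimensional embeddings, two neighbors, Z maps 4 dimensions to 2.
Z = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]
h_new = graphsage_update([1.0, 2.0], [[3.0, 0.0], [1.0, 2.0]], Z)
```
</preformat>
<p>After normalization every embedding has unit length, so embeddings of different nodes stay on a comparable scale from one iteration to the next.</p>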
</sec>
</sec>
<sec id="s2-2">
<title>2.2 LSTM</title>
<p>LSTM incorporates memory cells in its hidden layers to selectively remember and forget information, which allows it to retain historical information over a certain length. Notably, LSTM excels at capturing long-term dependencies in data and mitigates the vanishing and exploding gradient problems. <xref ref-type="fig" rid="F3">Figure 3</xref> illustrates the LSTM network structure, which contains multiple memory cells, each equipped with an input gate, a forget gate and an output gate.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>LSTM network structure.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g003.tif"/>
</fig>
<p>Input gate: The input gate <inline-formula id="inf29">
<mml:math id="m33">
<mml:mrow>
<mml:msub>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> regulates the inclusion of new inputs according to the hidden state of the previous period and the inputs of the current period. Additionally, the candidate cell state <inline-formula id="inf30">
<mml:math id="m34">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>c</mml:mi>
<mml:mo>&#x223c;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> selectively stores the input information. The input gate and the candidate cell state are calculated by <xref ref-type="disp-formula" rid="e5">Equations 5</xref>, <xref ref-type="disp-formula" rid="e6">6</xref>.<disp-formula id="e5">
<mml:math id="m35">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3c9;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x00B7;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>
<disp-formula id="e6">
<mml:math id="m36">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mo>&#x223c;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="bold-italic">tanh</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3c9;</mml:mi>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:msub>
<mml:mo>&#x00B7;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>where <inline-formula id="inf31">
<mml:math id="m37">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the hidden state of the previous period, <inline-formula id="inf32">
<mml:math id="m38">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the inputs of the current period, <inline-formula id="inf33">
<mml:math id="m39">
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> denotes the weight coefficients, <italic>b</italic> denotes the bias values, and <inline-formula id="inf34">
<mml:math id="m40">
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and tanh denote the sigmoid activation function and the hyperbolic tangent function, respectively.</p>
<p>Forget gate: The forget gate <inline-formula id="inf35">
<mml:math id="m41">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> determines how much of the previous cell state is forgotten or retained. The forget gate is calculated by <xref ref-type="disp-formula" rid="e7">Equation 7</xref>.<disp-formula id="e7">
<mml:math id="m42">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">f</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3c9;</mml:mi>
<mml:mi mathvariant="bold-italic">f</mml:mi>
</mml:msub>
<mml:mo>&#x00B7;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mi mathvariant="bold-italic">f</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(7)</label>
</disp-formula>
</p>
<p>Output gate: The output gate <inline-formula id="inf36">
<mml:math id="m43">
<mml:mrow>
<mml:msub>
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> obtains the output of the LSTM network according to the cell state <inline-formula id="inf37">
<mml:math id="m44">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. The output gate and the resulting hidden state are calculated by <xref ref-type="disp-formula" rid="e8">Equations 8</xref>, <xref ref-type="disp-formula" rid="e9">9</xref>.<disp-formula id="e8">
<mml:math id="m45">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">o</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3c9;</mml:mi>
<mml:mi mathvariant="bold-italic">o</mml:mi>
</mml:msub>
<mml:mo>&#x00B7;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mi mathvariant="bold-italic">o</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(8)</label>
</disp-formula>
<disp-formula id="e9">
<mml:math id="m46">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">o</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x2299;</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="bold-italic">tanh</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(9)</label>
</disp-formula>where <inline-formula id="inf38">
<mml:math id="m47">
<mml:mrow>
<mml:mo>&#x2299;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> denotes the Hadamard product. Based on the aforementioned calculation results, the cell state is updated by <xref ref-type="disp-formula" rid="e10">Equation 10</xref>.<disp-formula id="e10">
<mml:math id="m48">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">f</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x2299;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>&#x2299;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mo>&#x223c;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(10)</label>
</disp-formula>
</p>
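<p>The gate computations of this section can be combined into a single time step, sketched below in plain Python. The sketch assumes that the forget gate takes the standard sigmoid form, analogous to the input and output gates. The function names <italic>lstm_step</italic> and <italic>affine</italic> and the layout of the <italic>params</italic> dictionary are hypothetical and chosen only for illustration.</p>
<preformat>
```python
import math

def sigmoid(a):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-a))

def affine(W, b, v):
    """Compute W . v + b for a matrix W (list of rows), bias b and vector v."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, r_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to its (weights, bias) pair."""
    z = r_prev + x_t                                             # [r_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]          # input gate
    c_tilde = [math.tanh(a) for a in affine(*params["c"], z)]    # candidate cell state
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]          # forget gate
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]          # output gate
    # Cell state update: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Hidden state: output gate applied to the squashed cell state.
    r_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return r_t, c_t

# Toy example with hidden size 1 and input size 1; all weights and biases are zero,
# so every gate equals 0.5 and the candidate cell state equals 0.
zero = ([[0.0, 0.0]], [0.0])
params = {"i": zero, "c": zero, "f": zero, "o": zero}
r_t, c_t = lstm_step([0.3], [0.1], [2.0], params)
```
</preformat>
<p>With all-zero parameters the forget gate halves the previous cell state, which makes the role of each gate in the update easy to check by hand.</p>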
</sec>
</sec>
<sec id="s3">
<title>3 Spatial feature extraction</title>
<sec id="s3-1">
<title>3.1 mRMR</title>
<p>Assuming that the original dataset of a wind farm with <italic>N</italic> sites is <inline-formula id="inf39">
<mml:math id="m49">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>N</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf40">
<mml:math id="m50">
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>j</mml:mi>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>j</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>j</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>j</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> denotes the feature set of site <italic>j</italic>, <inline-formula id="inf41">
<mml:math id="m51">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi>M</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the <italic>i</italic>-th feature sequence of site <italic>j</italic>, the feature vector <inline-formula id="inf42">
<mml:math id="m52">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b3;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>N</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is constructed by concatenating the <italic>i</italic>-th feature sequences of all sites. The minimum redundancy maximum relevance (mRMR) algorithm assesses the significance of features using mutual information, which quantifies the statistical dependence between two variables, as shown in <xref ref-type="disp-formula" rid="e11">Equation 11</xref>.<disp-formula id="e11">
<mml:math id="m53">
<mml:mrow>
<mml:mi mathvariant="bold-italic">I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x222c;</mml:mo>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">log</mml:mi>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msub>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mi mathvariant="bold-italic">d</mml:mi>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mi mathvariant="bold-italic">d</mml:mi>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(11)</label>
</disp-formula>where <inline-formula id="inf43">
<mml:math id="m54">
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mo>&#x00B7;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> denotes the probability density function (joint or marginal, according to its arguments). The mRMR algorithm consists of two parts:</p>
<p>Maximum relevance: Let <italic>S</italic> be a subset of the features <inline-formula id="inf44">
<mml:math id="m55">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b3;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. The relevance is the mean mutual information between each feature in <italic>S</italic> and the target, as shown in <xref ref-type="disp-formula" rid="e12">Equation 12</xref>.<disp-formula id="e12">
<mml:math id="m56">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">D</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn mathvariant="bold">1</mml:mn>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(12)</label>
</disp-formula>where <inline-formula id="inf45">
<mml:math id="m57">
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:math>
</inline-formula> denotes the number of features in the subset and <italic>c</italic> denotes the target.</p>
<p>Minimum redundancy: The redundancy is the mean mutual information over all pairs of features in <italic>S</italic>, as shown in <xref ref-type="disp-formula" rid="e13">Equation 13</xref>.<disp-formula id="e13">
<mml:math id="m58">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(13)</label>
</disp-formula>
</p>
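<p>The relevance and redundancy terms can be illustrated with a small sketch. It assumes the features and the target have already been discretized, so the double integral of the mutual information is replaced by a sum over observed value pairs. The function names <italic>mutual_information</italic>, <italic>relevance</italic> and <italic>redundancy</italic> and the toy sequences are hypothetical.</p>
<preformat>
```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(a)
    n_a, n_b, n_ab = Counter(a), Counter(b), Counter(zip(a, b))
    # p(u, v) / (p(u) p(v)) simplifies to count_uv * n / (count_u * count_v).
    return sum((count / n) * math.log2(count * n / (n_a[u] * n_b[v]))
               for (u, v), count in n_ab.items())

def relevance(S, target):
    """Mean mutual information between each feature in S and the target."""
    return sum(mutual_information(f, target) for f in S) / len(S)

def redundancy(S):
    """Mean mutual information over all ordered feature pairs in S, self-pairs included."""
    return sum(mutual_information(f, g) for f in S for g in S) / len(S) ** 2

# Toy data: f1 copies the target, f2 is statistically independent of it.
target = [0, 0, 1, 1]
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
D = relevance([f1, f2], target)
R = redundancy([f1, f2])
```
</preformat>
<p>In this toy case <italic>f1</italic> carries one bit of information about the target and <italic>f2</italic> carries none, so a selection procedure that rewards relevance and penalizes redundancy would favor <italic>f1</italic>.</p>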
</sec>
<sec id="s3-2">
<title>3.2 mRMR-RF</title>
<p>Random forest (RF) is an ensemble machine learning algorithm. It combines multiple decision trees, each a supervised learner, to improve the accuracy and stability of feature selection. By using mutual information as the metric and considering both feature relevance and redundancy, mRMR can be embedded into RF to select feature subsets with minimal information redundancy, which effectively eliminates irrelevant or repetitive features. As illustrated in <xref ref-type="fig" rid="F4">Figure 4</xref>, the specific steps are as follows:</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>mRMR-RF feature selection.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g004.tif"/>
</fig>
<p>Initialize the original dataset: The original dataset is divided into features and target.</p>
<p>Data normalization: The constructed dataset is normalized by <xref ref-type="disp-formula" rid="e14">Equation 14</xref> to eliminate the influence of differing units and scales.<disp-formula id="e14">
<mml:math id="m59">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="bold-italic">min</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="bold-italic">max</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3b3;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="bold-italic">min</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(14)</label>
</disp-formula>where <inline-formula id="inf46">
<mml:math id="m60">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>&#x3b3;</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the normalized data, <inline-formula id="inf47">
<mml:math id="m61">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b3;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="italic">min</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf48">
<mml:math id="m62">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b3;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mi mathvariant="italic">max</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the minimum and maximum values of the feature data, respectively.</p>
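<p>A minimal sketch of the min-max normalization in Equation 14 is given below. The function name <italic>min_max_normalize</italic>, the handling of a constant feature and the sample values are assumptions made for illustration and are not taken from the dataset of this study.</p>
<preformat>
```python
def min_max_normalize(values):
    """Scale a feature sequence to the range [0, 1] using its minimum and maximum."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant feature is mapped to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

wind_speed = [3.0, 5.0, 11.0, 7.0]   # illustrative values only
scaled = min_max_normalize(wind_speed)
```
</preformat>
<p>In practice the minimum and maximum would typically be taken from the training data only and reused for the test data, so that no information leaks from the test set.</p>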
<p>Bootstrap sampling: Samples are randomly drawn with replacement from the feature data to build <italic>K</italic> decision trees. If the feature data contain <italic>M</italic> records and <italic>M</italic> records are drawn, the probability that a given record is never drawn is <inline-formula id="inf49">
<mml:math id="m63">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>M</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>. Since <inline-formula id="inf50">
<mml:math id="m64">
<mml:mrow>
<mml:munder>
<mml:mi>lim</mml:mi>
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo>&#x2192;</mml:mo>
</mml:mrow>
<mml:mi>&#x221e;</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi>P</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, about 36.8% of the records are left out of the training set of each tree when <italic>M</italic> is large. These records are called out-of-bag (OOB) data.</p>
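<p>The out-of-bag fraction can be checked numerically with the short sketch below, which evaluates the closed-form probability and also draws one bootstrap sample. The function names <italic>oob_probability</italic> and <italic>bootstrap_split</italic>, the sample size and the random seed are hypothetical choices.</p>
<preformat>
```python
import math
import random

def oob_probability(m):
    """Probability that a given record is never drawn in m draws with replacement."""
    return (1.0 - 1.0 / m) ** m

def bootstrap_split(m, rng):
    """Draw m indices with replacement; return (in-bag indices, out-of-bag indices)."""
    in_bag = [rng.randrange(m) for _ in range(m)]
    drawn = set(in_bag)
    oob = [idx for idx in range(m) if idx not in drawn]
    return in_bag, oob

rng = random.Random(0)                # fixed seed so the sketch is reproducible
in_bag, oob = bootstrap_split(10000, rng)
oob_fraction = len(oob) / 10000       # empirical share of records left out of the bag
limit = 1.0 / math.e                  # theoretical limit, roughly 0.368
```
</preformat>
<p>For a sample of this size the empirical out-of-bag fraction should lie close to the theoretical limit, which is why the OOB records can serve as a held-out set for each tree.</p>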
<p>Getting the candidate feature set: The error values <inline-formula id="inf51">
<mml:math id="m65">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>K</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> for each decision tree are calculated from the OOB data. RF then randomly shuffles the feature <inline-formula id="inf53">
<mml:math id="m67">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b3;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> in the OOB data and recalculates the error values <inline-formula id="inf54">
<mml:math id="m68">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2a;</mml:mo>
</mml:msubsup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. The feature importance is calculated by <xref ref-type="disp-formula" rid="e15">Equation 15</xref>.<disp-formula id="e15">
<mml:math id="m69">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">m</mml:mi>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mi mathvariant="bold-italic">o</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
</mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">K</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">K</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
</mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x2a;</mml:mo>
</mml:msubsup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
</mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(15)</label>
</disp-formula>
</p>
<p>The features are then ranked by importance, and those whose importance exceeds a predetermined threshold <inline-formula id="inf55">
<mml:math id="m70">
<mml:mrow>
<mml:mi>&#x3b5;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> form the candidate feature set.</p>
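<p>The candidate selection step can be sketched as follows. The sketch measures importance as the OOB error after shuffling a feature minus the error before shuffling, the usual convention under which a larger value marks a more important feature. The fitted trees are replaced by a simple stand-in function, and the names <italic>permutation_importance</italic>, <italic>select_candidates</italic>, the toy data and the threshold value are all hypothetical.</p>
<preformat>
```python
import random

def mse(model, X, y):
    """Mean squared error of a model (a callable on one row) over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(models, oob_sets, feature, rng):
    """Average increase in OOB error over all models after shuffling one feature column."""
    total = 0.0
    for model, (X, y) in zip(models, oob_sets):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
        total += mse(model, X_perm, y) - mse(model, X, y)
    return total / len(models)

def select_candidates(importances, epsilon):
    """Keep the indices of the features whose importance exceeds the threshold epsilon."""
    return [i for i, score in enumerate(importances) if score > epsilon]

rng = random.Random(42)
models = [lambda row: 2.0 * row[0]] * 3          # three identical stand-ins for fitted trees
X_oob = [[float(k), 5.0] for k in range(1, 9)]   # feature 1 is constant and carries no signal
y_oob = [2.0 * row[0] for row in X_oob]
oob_sets = [(X_oob, y_oob)] * 3
importances = [permutation_importance(models, oob_sets, f, rng) for f in range(2)]
candidates = select_candidates(importances, epsilon=0.5)
```
</preformat>
<p>Shuffling the constant feature leaves the error unchanged, so its importance is zero and it is excluded from the candidate set, whereas shuffling the informative feature raises the error.</p>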
<p>Partitioning feature subsets: All possible feature subsets <italic>s</italic>(<italic>u</italic>, <italic>v</italic>) are exhaustively enumerated from the candidate feature set, where <inline-formula id="inf56">
<mml:math id="m71">
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is the number of features in the subset and <inline-formula id="inf57">
<mml:math id="m72">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mi>C</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>u</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is the index of the subset.</p>
<p>Calculating the mRMR score of feature subsets: Calculate the score of the feature subset <italic>s</italic>(<italic>u, v</italic>) via the following <xref ref-type="disp-formula" rid="e16">Equation 16</xref>:<disp-formula id="e16">
<mml:math id="m73">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">s</mml:mi>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mi mathvariant="bold-italic">o</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">u</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:msub>
<mml:mi mathvariant="bold-italic">D</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">u</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:msub>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(16)</label>
</disp-formula>
</p>
<p>The optimal feature subset is selected by ranking each feature subset score.</p>
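<p>The enumeration and scoring of feature subsets can be sketched as follows. This is one possible reading of Equation 16, stated as an assumption: the relevance of a feature is its mutual information with the target, and its redundancy is its mean mutual information with the other features in the same subset. The mutual information values, the function names <italic>enumerate_subsets</italic> and <italic>subset_score</italic> and the three-feature setup are hypothetical.</p>
<preformat>
```python
from itertools import combinations

def enumerate_subsets(n):
    """All non-empty subsets of the feature indices 0..n-1, ordered by subset size u."""
    return [s for u in range(1, n + 1) for s in combinations(range(n), u)]

def subset_score(subset, relevance, pairwise):
    """Sum of per-feature relevance minus sum of per-feature redundancy for one subset."""
    others = max(len(subset) - 1, 1)   # number of other features; 1 avoids dividing by zero
    d_sum = sum(relevance[i] for i in subset)
    r_sum = sum(pairwise[i][j] for i in subset for j in subset if i != j) / others
    return d_sum - r_sum

# Hypothetical mutual information values for three candidate features.
relevance = [0.9, 0.8, 0.6]            # feature versus target
pairwise = [[0.0, 0.9, 0.1],           # feature versus feature, symmetric
            [0.9, 0.0, 0.1],
            [0.1, 0.1, 0.0]]

subsets = enumerate_subsets(3)
best = max(subsets, key=lambda s: subset_score(s, relevance, pairwise))
```
</preformat>
<p>Features 0 and 1 are both relevant but strongly redundant with each other, so the highest score goes to the subset that pairs feature 0 with the less redundant feature 2. Exhaustive enumeration visits every non-empty subset, a number that doubles with each added feature, so it is only practical when the candidate set is small.</p>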
<p>Feature fusion: The fused feature <inline-formula id="inf58">
<mml:math id="m74">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> of site <italic>j</italic> is obtained by averaging the features in the selected optimal feature subset. <inline-formula id="inf59">
<mml:math id="m75">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>N</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> denotes the fused features of all sites. Feature fusion is performed according to <xref ref-type="disp-formula" rid="e17">Equation 17</xref>.<disp-formula id="e17">
<mml:math id="m76">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">u</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">u</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">&#x3c7;</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msubsup>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(17)</label>
</disp-formula>
</p>
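<p>As a minimal sketch of Equation 17, the fused series of one site can be computed as the element-wise mean of its <italic>u</italic> selected feature series. The function name and the list-of-lists layout are assumptions made for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
def fuse_features(selected):
    """Element-wise mean of the u selected feature series of one site (Equation 17).

    selected is a list of u equally long lists, one per selected feature.
    """
    u = len(selected)
    if u == 0:
        raise ValueError("at least one selected feature is required")
    length = len(selected[0])
    if any(len(series) != length for series in selected):
        raise ValueError("all feature series must have the same length")
    # zip(*selected) yields, for each time step, the u feature values to average.
    return [sum(values) / u for values in zip(*selected)]


# Two hypothetical wind speed series (m/s) for one site.
fused = fuse_features([[4.0, 6.0, 8.0], [6.0, 8.0, 10.0]])
```
]]></preformat>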
</sec>
<sec id="s3-3">
<title>3.3 Topology construction of spatial features</title>
<p>Graph-structured data are used to represent the multi-site features and their correlations. The <italic>N</italic> sites and their feature correlations are organized into the topology defined in <xref ref-type="disp-formula" rid="e18">Equation 18</xref>.<disp-formula id="e18">
<mml:math id="m77">
<mml:mrow>
<mml:mi mathvariant="bold-italic">G</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mo>,</mml:mo>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">W</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(18)</label>
</disp-formula>where <inline-formula id="inf60">
<mml:math id="m78">
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="}" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>.</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>N</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> denotes the set of sites, <italic>E</italic> denotes the set of edges formed between sites, <inline-formula id="inf61">
<mml:math id="m79">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> represents the fused features of all sites, and <inline-formula id="inf62">
<mml:math id="m80">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the weighted adjacency matrix, which is constructed by jointly considering the spatial distance correlation and the time series correlation among the sites. The spatial weighted adjacency matrix <inline-formula id="inf63">
<mml:math id="m81">
<mml:mrow>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> and the temporal weighted adjacency matrix <inline-formula id="inf64">
<mml:math id="m82">
<mml:mrow>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> express these two correlations separately. We then take a linearly weighted sum of the two to obtain the weighted adjacency matrix <italic>W</italic>, as shown in <xref ref-type="disp-formula" rid="e19">Equation 19</xref>.<disp-formula id="e19">
<mml:math id="m83">
<mml:mrow>
<mml:mi mathvariant="bold-italic">W</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b1;</mml:mi>
<mml:msup>
<mml:mi mathvariant="bold-italic">W</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">s</mml:mi>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mn mathvariant="bold">1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b1;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">W</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">m</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(19)</label>
</disp-formula>where <inline-formula id="inf65">
<mml:math id="m84">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="&#x7c;">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is a weighting parameter.</p>
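<p>The combination in Equation 19 can be illustrated with a short sketch. It assumes that the spatial and temporal matrices are already available as nested lists of equal shape; the function name and the example values are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def combine_adjacency(w_space, w_time, alpha):
    """Equation 19: W = alpha * W_space + (1 - alpha) * W_time, element by element."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(w_space) != len(w_time) or any(
        len(row_s) != len(row_t) for row_s, row_t in zip(w_space, w_time)
    ):
        raise ValueError("both matrices must have the same shape")
    return [
        [alpha * s + (1.0 - alpha) * t for s, t in zip(row_s, row_t)]
        for row_s, row_t in zip(w_space, w_time)
    ]


# Hypothetical 2 x 2 matrices, for illustration only.
w = combine_adjacency([[1.0, 0.5], [0.5, 1.0]], [[1.0, 0.1], [0.1, 1.0]], alpha=0.25)
```
]]></preformat>
<p>With a weighting parameter of 0 the result reduces to the temporal matrix, and with 1 it reduces to the spatial matrix.</p>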
<p>Spatial distance correlation: The wind speeds at adjacent sites are correlated because the sites share the same regional atmospheric conditions. Generally, the closer two sites are, the stronger the correlation between them, and <italic>vice versa</italic>. Therefore, we take the actual longitude and latitude of the sites as inputs and use the Haversine formula to calculate the spherical distance between each pair of sites. If the distance is less than or equal to a threshold parameter <inline-formula id="inf66">
<mml:math id="m85">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, we use the Gaussian kernel function to calculate the spatial distance correlation; otherwise, the two sites are treated as having no spatial distance correlation. For any two sites <inline-formula id="inf67">
<mml:math id="m86">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf68">
<mml:math id="m87">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, the edge weight is calculated according to the following <xref ref-type="disp-formula" rid="e20">Equation 20</xref>.<disp-formula id="e20">
<mml:math id="m88">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">w</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">s</mml:mi>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="" separators="&#x7c;">
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="bold-italic">exp</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">d</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">s</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mn mathvariant="bold">2</mml:mn>
<mml:msup>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">f</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi mathvariant="bold-italic">d</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">s</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2264;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mn mathvariant="bold">0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">o</mml:mi>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mi mathvariant="bold-italic">w</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">s</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(20)</label>
</disp-formula>where <inline-formula id="inf69">
<mml:math id="m89">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> represents the spherical distance between nodes <inline-formula id="inf70">
<mml:math id="m90">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf71">
<mml:math id="m91">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf72">
<mml:math id="m92">
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is the standard deviation of <inline-formula id="inf73">
<mml:math id="m93">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
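<p>The construction of the spatial matrix can be sketched as below. This is an illustrative sketch under stated assumptions rather than the original implementation: the mean Earth radius of 6371&#xa0;km, the use of the population standard deviation over all distinct site pairs, the function names and the example coordinates are all assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
from statistics import pstdev

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def spatial_adjacency(coords, theta_km):
    """Equation 20: Gaussian-kernel weight where dist <= theta, otherwise 0.

    coords is a list of (latitude, longitude) pairs in degrees.
    """
    n = len(coords)
    dist = [[haversine_km(*coords[i], *coords[j]) for j in range(n)] for i in range(n)]
    pair_distances = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    if len(pair_distances) < 2:
        raise ValueError("at least three sites are needed to estimate sigma")
    # Assumption: sigma is the population standard deviation of the
    # distances over all distinct site pairs.
    sigma = pstdev(pair_distances)
    if sigma == 0:
        raise ValueError("all pair distances are equal, so sigma is zero")
    return [
        [math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2)) if dist[i][j] <= theta_km else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Three hypothetical sites on the same parallel.
sites = [(32.0, -100.0), (32.0, -99.9), (32.0, -99.0)]
w_space = spatial_adjacency(sites, theta_km=20.0)
```
]]></preformat>
<p>In this sketch the diagonal entries equal 1 because a site is at distance zero from itself; whether self-loops are kept is a modelling choice that the text above does not specify.</p>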
<p>Time series correlation: The wind speeds at multiple sites are affected not only by spatial distance but also by the similarity of their time series. The closer the wind speeds of two sites are at the same time, the stronger their time series correlation. Because wind speed series are nonlinear, we adopt the MIC, which partitions the data space into grids. The mutual information between <inline-formula id="inf74">
<mml:math id="m94">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf75">
<mml:math id="m95">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and the normalized edge weight are given by <xref ref-type="disp-formula" rid="e21">Equations 21</xref>, <xref ref-type="disp-formula" rid="e22">22</xref>.<disp-formula id="e21">
<mml:math id="m96">
<mml:mrow>
<mml:mi mathvariant="bold-italic">I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x222c;</mml:mo>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">log</mml:mi>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msub>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mi mathvariant="bold-italic">d</mml:mi>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msup>
<mml:mi mathvariant="bold-italic">d</mml:mi>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(21)</label>
</disp-formula>
<disp-formula id="e22">
<mml:math id="m97">
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">w</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">t</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">m</mml:mi>
<mml:mi mathvariant="bold-italic">e</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">M</mml:mi>
<mml:mi mathvariant="bold-italic">I</mml:mi>
<mml:mi mathvariant="bold-italic">C</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">max</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi mathvariant="bold-italic">B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">log</mml:mi>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mi mathvariant="bold-italic">min</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(22)</label>
</disp-formula>where <inline-formula id="inf76">
<mml:math id="m98">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the maximum information coefficient between nodes and <italic>B</italic> is the upper limit on the number of grid cells, which is generally taken as the total number of data points raised to the power 0.6 (<xref ref-type="bibr" rid="B4">Chen et al., 2016</xref>).</p>
</sec>
<sec id="s3-4">
<title>3.4 mRMR-RF-GraphSAGE</title>
<p>The specific processes of the spatial feature extraction framework based on mRMR-RF-GraphSAGE are as follows:<list list-type="simple">
<list-item>
<p>(1) Use mRMR-RF to select the most representative and discriminative optimal feature subset, effectively reducing the dimensionality and complexity of the original feature data. Feature fusion is then applied to synthesize the selected features into a comprehensive representation;</p>
</list-item>
<list-item>
<p>(2) Construct the spatial feature map topology by integrating the location information of the multiple sites, thereby transforming the data into a graph-structured format. The spatial feature map serves as a topological representation of the features that conveys the geographical relationships and time series correlations between different sites;</p>
</list-item>
<list-item>
<p>(3) Train GraphSAGE on the graph-structured data to generate embedded spatial feature vectors by sampling and aggregating neighbor information, thereby exploiting the structure of the graph.</p>
</list-item>
</list>
</p>
</sec>
</sec>
<sec id="s4">
<title>4 Spatio-temporal feature extraction framework</title>
<p>The spatio-temporal feature extraction framework proposed in this paper comprises four main components: data preparation, multi-site map topology construction, spatial feature extraction, and temporal feature extraction, as illustrated in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Spatio-temporal feature extraction framework.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g005.tif"/>
</fig>
<p>Data preparation: We gather and systematically arrange the geographic locations of the sites, the historical wind speed records, and other environmental feature data. Using the mRMR-RF method, we select the optimal features to reduce the dimensionality of the original data. The fused feature used for wind speed prediction is then obtained from the optimal subset.</p>
<p>Multi-site map topology construction: Using the fused feature, we construct graph-structured data that incorporate the spatial geographic information of the sites. Each site is represented as a node, while the edges quantify the spatial distance and time series correlation between sites. The weighted adjacency matrix <italic>W</italic> is then constructed by combining the spatial correlation matrix <inline-formula id="inf77">
<mml:math id="m99">
<mml:mrow>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> based on the Haversine formula, with the time series correlation matrix <inline-formula id="inf78">
<mml:math id="m100">
<mml:mrow>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> derived from the MIC.</p>
<p>Spatial feature extraction: We utilize GraphSAGE to extract spatial features from the multi-site map topology. The uniform sampling and mean aggregation approaches are adopted to sample and aggregate neighboring sites, which yields the feature representation of the central site. Then, the spatial feature representations for each site are generated by a fully connected layer, which captures the complex spatial correlations among multiple sites.</p>
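<p>The sampling and aggregation step can be sketched for a single node as follows. This is a simplified illustration: it shows only uniform neighbor sampling and mean aggregation, and it omits the learned weight matrix, the nonlinearity and the fully connected layer. The function names, the sample size and the example graph are assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random


def sample_neighbors(adjacency, node, k, rng):
    """Uniformly sample up to k neighbors of node (entries with positive weight)."""
    neighbors = [j for j, weight in enumerate(adjacency[node]) if j != node and weight > 0]
    if len(neighbors) <= k:
        return neighbors
    return rng.sample(neighbors, k)


def mean_aggregate(features, adjacency, node, k, rng):
    """Concatenate the node's own vector with the mean of its sampled neighbors' vectors."""
    own = list(features[node])
    sampled = sample_neighbors(adjacency, node, k, rng)
    if sampled:
        neighbor_mean = [
            sum(features[j][d] for j in sampled) / len(sampled) for d in range(len(own))
        ]
    else:
        # An isolated node has no neighborhood information; use zeros.
        neighbor_mean = [0.0] * len(own)
    return own + neighbor_mean


# Hypothetical three-site graph with one feature per site.
rng = random.Random(0)
features = [[1.0], [3.0], [5.0]]
adjacency = [[1.0, 0.4, 0.2], [0.4, 1.0, 0.0], [0.2, 0.0, 1.0]]
embedding_input = mean_aggregate(features, adjacency, node=0, k=2, rng=rng)
```
]]></preformat>
<p>In a trained model, the concatenated vector would then be passed through a learned linear map and an activation function to produce the node embedding.</p>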
<p>Temporal feature extraction: The spatial feature vectors are temporally resampled using the sliding window method to generate spatio-temporal feature vectors. Subsequently, LSTM is employed for extracting temporal correlation information. The LSTM hidden layer consists of 64 nodes and the time window sampling step is set to 24. Finally, a linear layer and a fully connected layer are utilized to output the predicted values of multiple sites, thereby completing the construction of the spatio-temporal feature extraction framework.</p>
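<p>The sliding window resampling can be sketched as below. The sketch assumes that the value 24 denotes the window length in time steps and that the target is the value one step after the window; both readings, as well as the function name, are assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
def sliding_windows(series, window=24, horizon=1):
    """Split a time-ordered sequence into (input window, target) pairs.

    series holds one feature vector per time step; the target is the
    vector horizon steps after the end of each window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Thirty hypothetical time steps with a single feature each.
series = [[float(t)] for t in range(30)]
samples = sliding_windows(series, window=24, horizon=1)
```
]]></preformat>
<p>Each input window can then be fed to the LSTM as one sequence, with the corresponding target used as the label.</p>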
</sec>
<sec id="s5">
<title>5 Case analysis</title>
<sec id="s5-1">
<title>5.1 Simulation experiment setting</title>
<p>The simulation experiment is implemented in Python 3.11 with a GPU-accelerated build of the open-source PyTorch 2.1.1 framework, which uses CUDA and cuDNN to speed up the training and inference of deep neural networks. The hardware platform is as follows: an Intel i7-13620H 2.4&#xa0;GHz CPU, 16&#xa0;GB of RAM, and an NVIDIA GeForce RTX 4060 GPU with 8&#xa0;GB of memory. To reduce overfitting, the order of the input samples is randomly shuffled. Additionally, a fixed random seed is set to ensure reproducibility by eliminating the influence of random factors in the deep learning models. The model is trained for 200 epochs with a batch size of 72. Training minimizes the mean absolute error (MAE) loss using the Adam optimizer, with the weight decay set to <inline-formula id="inf79">
<mml:math id="m101">
<mml:mrow>
<mml:msup>
<mml:mn>10</mml:mn>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>5</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> (<xref ref-type="bibr" rid="B12">Kingma and Ba, 2017</xref>).</p>
</sec>
<sec id="s5-2">
<title>5.2 Evaluation metrics</title>
<p>In this study, the mean absolute error (MAE) and root mean square error (RMSE) are selected as model evaluation metrics. They are defined in <xref ref-type="disp-formula" rid="e23">Equations 23</xref>, <xref ref-type="disp-formula" rid="e24">24</xref>.<disp-formula id="e23">
<mml:math id="m102">
<mml:mrow>
<mml:mi mathvariant="bold-italic">M</mml:mi>
<mml:mi mathvariant="bold-italic">A</mml:mi>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(23)</label>
</disp-formula>
<disp-formula id="e24">
<mml:math id="m103">
<mml:mrow>
<mml:mtext mathvariant="bold">RMSE</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold">n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold">n</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold">y</mml:mi>
<mml:mi mathvariant="bold">i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold">y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi mathvariant="bold">i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn mathvariant="bold">2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:msqrt>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(24)</label>
</disp-formula>where <inline-formula id="inf80">
<mml:math id="m104">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the true observed value of wind speed, <inline-formula id="inf81">
<mml:math id="m105">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the model output predicted value of wind speed, and <inline-formula id="inf82">
<mml:math id="m106">
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is the total number of data samples.</p>
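<p>Both metrics follow directly from Equations 23, 24 and can be sketched in a few lines. The function names and the example values are illustrative assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def mae(y_true, y_pred):
    """Equation 23: mean of the absolute errors."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Equation 24: square root of the mean squared error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical observed and predicted wind speeds (m/s).
observed = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
example_mae = mae(observed, predicted)
example_rmse = rmse(observed, predicted)
```
]]></preformat>
<p>Because RMSE squares the errors before averaging, it penalizes large deviations more strongly than MAE, which is why the two metrics are usually reported together.</p>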
</sec>
<sec id="s5-3">
<title>5.3 Data description</title>
<p>The Wind Integration National Dataset (WIND) Toolkit is sourced from the National Renewable Energy Laboratory (NREL) in the United States (<xref ref-type="bibr" rid="B9">Jager and Andreas, 1996</xref>). This dataset includes meteorological conditions at different heights for more than 2,488,136 sites in the continental United States for the years 2007&#x2013;2014. The dataset has a 2&#xa0;km spatial resolution and a 15-min temporal resolution. It consists of 32 features, as shown in <xref ref-type="table" rid="T1">Table 1</xref>, including wind speed, wind direction, temperature, and air pressure. The data selected for this study come from 49 sites within the Roscoe Wind Farm area in Texas, United States, and cover the entire year of 2014. Although wind turbines are often assumed to be arranged in a regular rectangular grid within a wind farm, in practice most turbines are irregularly distributed. Therefore, we randomly selected 20 unevenly distributed sites from all available sites, as shown in <xref ref-type="fig" rid="F6">Figure 6</xref>. The dataset is divided into three parts. Data from January to October 2014 are used as the training set, which provides sufficient data for model training. Data from November 2014 are used as the test set to evaluate the generalization ability of the final model. Data from December 2014 are used as the validation set, which helps to tune the hyperparameters and prevent overfitting.</p>
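<p>The chronological split described above can be sketched as follows, assuming that each record carries a timestamp from 2014 at 15-min resolution. The function name, the record layout and the synthetic records are assumptions made for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
from datetime import datetime, timedelta


def split_by_month(records):
    """Split (timestamp, value) records: Jan-Oct train, Nov test, Dec validation."""
    parts = {"train": [], "test": [], "validation": []}
    for timestamp, value in records:
        if timestamp.month <= 10:
            parts["train"].append((timestamp, value))
        elif timestamp.month == 11:
            parts["test"].append((timestamp, value))
        else:
            parts["validation"].append((timestamp, value))
    return parts


# Synthetic year of 15-min records for one site (values are placeholders).
start = datetime(2014, 1, 1)
records = [(start + timedelta(minutes=15 * k), float(k)) for k in range(365 * 96)]
parts = split_by_month(records)
```
]]></preformat>
<p>With 96 records per day, this yields 304 days of training data, 30 days of test data and 31 days of validation data per site.</p>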
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Statistical data of the wind station features.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Feature</th>
<th align="center">Unit</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">Wind speed at 10&#xa0;m, 40&#xa0;m, 60&#xa0;m, 80&#xa0;m, 100&#xa0;m, 120&#xa0;m, 140&#xa0;m, 160&#xa0;m, 200&#xa0;m</td>
<td align="center">m/s</td>
</tr>
<tr>
<td align="center">Wind direction at 10&#xa0;m, 40&#xa0;m, 60&#xa0;m, 80&#xa0;m, 100&#xa0;m, 120&#xa0;m, 140&#xa0;m, 160&#xa0;m, 200&#xa0;m</td>
<td align="center">&#xb0;</td>
</tr>
<tr>
<td align="center">Air temperature at 10&#xa0;m, 40&#xa0;m, 60&#xa0;m, 80&#xa0;m, 100&#xa0;m, 120&#xa0;m, 140&#xa0;m, 160&#xa0;m, 200&#xa0;m</td>
<td align="center">&#xb0;C</td>
</tr>
<tr>
<td align="center">Air pressure at 0&#xa0;m, 100&#xa0;m, 200&#xa0;m</td>
<td align="center">Pa</td>
</tr>
<tr>
<td align="center">Relative humidity</td>
<td align="center">%</td>
</tr>
<tr>
<td align="center">Precipitation rate</td>
<td align="center">mm/h</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Schematic diagram of the multi-site location.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g006.tif"/>
</fig>
</sec>
<sec id="s5-4">
<title>5.4 Experimental results and analysis</title>
<sec id="s5-4-1">
<title>5.4.1 Feature selection analysis</title>
<p>Using the mRMR-RF feature selection algorithm, we first compute the RF feature importance and retain the features whose importance exceeds the threshold (<inline-formula id="inf83">
<mml:math id="m107">
<mml:mrow>
<mml:mi mathvariant="normal">&#x3b5;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 0.01). The retained features are then exhaustively combined into feature subsets, and the mRMR score of each subset is calculated. In this way, we identify the optimal feature subset, which is highly relevant to wind speed prediction. As illustrated in <xref ref-type="fig" rid="F7">Figure 7</xref>, this subset comprises wind speed at heights of 80&#xa0;m, 60&#xa0;m, 140&#xa0;m, and 160&#xa0;m. The remaining features were not selected because of their lower correlation with wind speed prediction.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Importance ranking of wind site features.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g007.tif"/>
</fig>
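<p>The two-stage selection described in this subsection can be sketched as follows. The sketch rests on several assumptions that are not taken from the study: the RF importances are supplied as precomputed numbers, relevance and redundancy are measured with the absolute Pearson correlation as a simple stand-in for the mutual information used in mRMR, the score is the difference between mean relevance and mean redundancy, and all feature names and values are hypothetical toy data.</p>
<preformat><![CDATA[
```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_score(subset, columns, target):
    """Mean relevance to the target minus mean pairwise redundancy (assumed form)."""
    relevance = sum(abs(pearson(columns[f], target)) for f in subset) / len(subset)
    pairs = list(combinations(subset, 2))
    if not pairs:
        return relevance
    redundancy = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return relevance - redundancy


def select_features(importances, columns, target, eps=0.01):
    """Keep features with importance above eps, then score every non-empty subset."""
    kept = [name for name, value in importances.items() if value > eps]
    best_subset, best_score = None, float("-inf")
    for size in range(1, len(kept) + 1):
        for subset in combinations(kept, size):
            score = mrmr_score(subset, columns, target)
            if score > best_score:
                best_subset, best_score = subset, score
    return kept, best_subset, best_score


# Hypothetical toy data, for illustration only.
columns = {
    "ws_80m": [5.1, 6.0, 7.2, 6.5, 4.8, 5.9],
    "ws_60m": [4.9, 5.8, 7.0, 6.2, 4.6, 5.7],
    "pressure": [1.0, 1.2, 0.9, 1.1, 1.0, 1.3],
}
target = [5.3, 6.1, 7.4, 6.6, 5.0, 6.0]
importances = {"ws_80m": 0.62, "ws_60m": 0.37, "pressure": 0.004}

kept, best_subset, best_score = select_features(importances, columns, target)
print(kept, best_subset)
```
]]></preformat>
<p>Because the number of subsets grows as 2 to the power of the number of retained features, the importance threshold in the first stage is what keeps the exhaustive second stage tractable.</p>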
</sec>
<sec id="s5-4-2">
<title>5.4.2 Parameter analysis in map topology construction</title>
<p>In map topology construction, the threshold parameter <inline-formula id="inf84">
<mml:math id="m108">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and the weighting parameter <inline-formula id="inf85">
<mml:math id="m109">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> influence the constructed graph structure. The threshold parameter <inline-formula id="inf86">
<mml:math id="m110">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is varied from the distance between the two nearest sites to the distance between the two farthest sites, with a step length of 1&#xa0;km. The weighting parameter <inline-formula id="inf87">
<mml:math id="m111">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is varied from 0 to 1, with a step length of 0.05. The optimal parameters obtained through exhaustive search are <inline-formula id="inf88">
<mml:math id="m112">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 8&#xa0;km and <inline-formula id="inf89">
<mml:math id="m113">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 0.4. The constructed topology of the multi-site is shown in <xref ref-type="fig" rid="F8">Figure 8</xref>.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Multi-site topology construction.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g008.tif"/>
</fig>
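<p>The exhaustive search over the threshold and the weighting parameter can be sketched as follows. This is an illustrative sketch only: the helper names, the distance bounds of 2&#xa0;km and 15&#xa0;km, and the objective toy_error are hypothetical. In practice the objective would build the graph for the given parameter pair, train the model and return a validation error; here a placeholder is used whose minimum was placed at the reported optimum only to make the example deterministic.</p>
<preformat><![CDATA[
```python
def inclusive_range(start, stop, step):
    """Values start, start + step, ... not exceeding stop (robust to float drift)."""
    count = int((stop - start) / step + 1e-9)
    return [round(start + i * step, 10) for i in range(count + 1)]


def grid_search(theta_min_km, theta_max_km, evaluate):
    """Evaluate every (theta, alpha) pair and return the pair with the lowest error."""
    thetas = inclusive_range(theta_min_km, theta_max_km, 1.0)  # 1 km step
    alphas = inclusive_range(0.0, 1.0, 0.05)                   # 0.05 step
    best = None
    for theta in thetas:
        for alpha in alphas:
            error = evaluate(theta, alpha)
            if best is None or error < best[0]:
                best = (error, theta, alpha)
    return best, len(thetas) * len(alphas)


def toy_error(theta, alpha):
    """Placeholder objective (hypothetical), not the validation error of the study."""
    return (theta - 8.0) ** 2 + (alpha - 0.4) ** 2


# Hypothetical nearest and farthest inter-site distances of 2 km and 15 km.
best, n_evaluated = grid_search(2.0, 15.0, toy_error)
print(best, n_evaluated)
```
]]></preformat>
<p>With 21 candidate values of the weighting parameter, the cost of the search scales linearly with the number of candidate thresholds, which is why the 1&#xa0;km step matters for larger site layouts.</p>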
</sec>
<sec id="s5-4-3">
<title>5.4.3 Baseline model</title>
<p>To validate the efficacy and superiority of the proposed MRGS-LSTM wind speed prediction model, we compared it with several widely adopted wind speed prediction algorithms. For spatial feature extraction, three graph neural networks were compared, namely GCN, GAT, and GraphSAGE. For time series prediction, the multilayer perceptron (MLP) and LSTM were compared.</p>
<p>We conducted comparative experiments on GCN-MLP, GAT-MLP, GS-MLP, GCN-LSTM, GAT-LSTM, and MRGS-LSTM. <xref ref-type="table" rid="T2">Table 2</xref> lists the MAEs and RMSEs of wind speed prediction for these models. The min, max, average and std columns give the minimum, maximum, mean and standard deviation of the prediction error over all sites, respectively. To quantify the improvement of MRGS-LSTM over the other prediction models, the improvement rate imp is defined in <xref ref-type="disp-formula" rid="e25">Equation 25</xref>.<disp-formula id="e25">
<mml:math id="m114">
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="bold-italic">m</mml:mi>
<mml:mi mathvariant="bold-italic">p</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="&#x7c;">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mi mathvariant="bold-italic">o</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">E</mml:mi>
<mml:mi mathvariant="bold-italic">o</mml:mi>
</mml:msub>
</mml:mfrac>
<mml:mo>&#xd7;</mml:mo>
<mml:mn mathvariant="bold">100</mml:mn>
<mml:mo>%</mml:mo>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(25)</label>
</disp-formula>where <inline-formula id="inf90">
<mml:math id="m115">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> denotes the relative improvement in prediction error of our model over a comparison model, <inline-formula id="inf91">
<mml:math id="m116">
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mi>o</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the error value of the comparison model, and <inline-formula id="inf92">
<mml:math id="m117">
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the error value of our model.</p>
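<p>As a worked check of Equation 25, the improvement rates can be recomputed from the mean MAE values reported in Table 2. The function name improvement is hypothetical; the error values are copied from the table.</p>
<preformat><![CDATA[
```python
def improvement(e_other, e_proposed):
    """Equation 25: |E_o - E_p| / E_o * 100, in percent."""
    return abs(e_other - e_proposed) / e_other * 100.0


# Mean MAE values from Table 2.
mean_mae = {"GCN-MLP": 0.5905, "GAT-MLP": 0.6332, "GS-MLP": 0.5393}
proposed_mae = 0.4561  # MRGS-LSTM

imp = {name: round(improvement(err, proposed_mae), 2) for name, err in mean_mae.items()}
print(imp)
```
]]></preformat>
<p>Because the inputs are the rounded means from the table, a recomputed value can differ from the tabulated one in the last digit; for GCN-LSTM, for example, the rounded means 0.5608 and 0.4561 give 18.67 against the tabulated 18.66.</p>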
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Comparison of the wind speed prediction errors of multiple sites.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="center">Model</th>
<th colspan="5" align="center">MAE</th>
<th colspan="5" align="center">RMSE</th>
</tr>
<tr>
<th align="center">Min</th>
<th align="center">Max</th>
<th align="center">Average</th>
<th align="center">Std</th>
<th align="center">Imp (%)</th>
<th align="center">Min</th>
<th align="center">Max</th>
<th align="center">Average</th>
<th align="center">Std</th>
<th align="center">Imp (%)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">GCN-MLP</td>
<td align="center">0.4973</td>
<td align="center">0.7339</td>
<td align="center">0.5905</td>
<td align="center">0.0645</td>
<td align="center">22.76</td>
<td align="center">0.6955</td>
<td align="center">0.9658</td>
<td align="center">0.8151</td>
<td align="center">0.0744</td>
<td align="center">20.70</td>
</tr>
<tr>
<td align="center">GAT-MLP</td>
<td align="center">0.4497</td>
<td align="center">0.8495</td>
<td align="center">0.6332</td>
<td align="center">0.0827</td>
<td align="center">27.97</td>
<td align="center">0.7125</td>
<td align="center">1.1509</td>
<td align="center">0.8665</td>
<td align="center">0.1015</td>
<td align="center">25.40</td>
</tr>
<tr>
<td align="center">GS-MLP</td>
<td align="center">0.4345</td>
<td align="center">0.6856</td>
<td align="center">0.5393</td>
<td align="center">0.0707</td>
<td align="center">15.43</td>
<td align="center">0.6157</td>
<td align="center">0.8977</td>
<td align="center">0.7393</td>
<td align="center">0.0808</td>
<td align="center">12.57</td>
</tr>
<tr>
<td align="center">GCN-LSTM</td>
<td align="center">0.477</td>
<td align="center">0.6843</td>
<td align="center">0.5608</td>
<td align="center">0.0579</td>
<td align="center">18.66</td>
<td align="center">0.6614</td>
<td align="center">0.9148</td>
<td align="center">0.7664</td>
<td align="center">0.0762</td>
<td align="center">15.66</td>
</tr>
<tr>
<td align="center">GAT-LSTM</td>
<td align="center">0.4763</td>
<td align="center">1.055</td>
<td align="center">0.6105</td>
<td align="center">0.1261</td>
<td align="center">25.29</td>
<td align="center">0.664</td>
<td align="center">1.2861</td>
<td align="center">0.8212</td>
<td align="center">0.1376</td>
<td align="center">21.29</td>
</tr>
<tr>
<td align="center">MRGS-LSTM</td>
<td align="center">0.4158</td>
<td align="center">0.5326</td>
<td align="center">0.4561</td>
<td align="center">0.0327</td>
<td align="center">0.00</td>
<td align="center">0.5906</td>
<td align="center">0.7562</td>
<td align="center">0.6464</td>
<td align="center">0.0452</td>
<td align="center">0.00</td>
</tr>
</tbody>
</table>
</table-wrap>
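<p>The per-site statistics reported in Table 2 can be obtained along the following lines. This is a minimal sketch under stated assumptions: the site names and wind speed values are hypothetical toy data, and the standard deviation is computed in its population form, since the text does not state which form was used.</p>
<preformat><![CDATA[
```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error of one site."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error of one site."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def summarize(values):
    """Minimum, maximum, mean and population standard deviation over the sites."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return {"min": min(values), "max": max(values), "average": mean, "std": std}


# Hypothetical observed and predicted wind speeds (m/s) for two sites.
observed = {"site_a": [5.0, 6.0, 7.0], "site_b": [4.0, 4.5, 5.0]}
predicted = {"site_a": [5.5, 6.0, 6.5], "site_b": [4.0, 5.5, 5.5]}

site_mae = [mae(observed[s], predicted[s]) for s in observed]
site_rmse = [rmse(observed[s], predicted[s]) for s in observed]
mae_summary = summarize(site_mae)
rmse_summary = summarize(site_rmse)
print(mae_summary)
print(rmse_summary)
```
]]></preformat>
<p>Reporting the spread across sites alongside the mean separates a model that is uniformly accurate from one that is accurate on average but poor at a few sites.</p>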
<p>The minimum error among these models is indicated in bold in <xref ref-type="table" rid="T2">Table 2</xref>. The proposed MRGS-LSTM model exhibits the smallest MAE and RMSE values, outperforming all other prediction models. Compared with GCN-MLP, GAT-MLP, and GS-MLP, which do not incorporate temporal features, MRGS-LSTM improves the mean MAE by 22.76%, 27.97%, and 15.43%, respectively, and the mean RMSE by 20.70%, 25.40%, and 12.57%, respectively. Compared with GCN-LSTM and GAT-LSTM, which use other GNNs for spatial feature extraction, MRGS-LSTM improves the mean MAE by 18.66% and 25.29%, respectively, and the mean RMSE by 15.66% and 21.29%, respectively. Across the sites, the MAE of MRGS-LSTM ranges from 0.4158&#xa0;m/s to 0.5326&#xa0;m/s and the RMSE ranges from 0.5906&#xa0;m/s to 0.7562&#xa0;m/s. The average and std are also smaller than those of every other prediction model. These results indicate that the proposed MRGS-LSTM model delivers more accurate and more stable spatio-temporal feature extraction for wind farms.</p>
<p>The prediction errors of the MRGS-LSTM model at the 20 sites are shown in <xref ref-type="fig" rid="F9">Figure 9</xref>, where the size of the balls reflects the magnitude of the MAE and RMSE values. The prediction results are correlated with the feature map topology. Sites closer to the center exhibit better prediction accuracy because they incorporate wind speed features from neighboring sites. Conversely, sites located at the periphery exhibit poorer prediction performance because they are less correlated with the surrounding sites. <xref ref-type="fig" rid="F10">Figure 10</xref> shows the prediction results of two sites under the different prediction models from December 1 to 15, 2014. <xref ref-type="fig" rid="F10">Figure 10A</xref> shows the wind speed prediction curve for site &#x23;11, which has the lowest MAE and RMSE. <xref ref-type="fig" rid="F10">Figure 10B</xref> shows the wind speed prediction curve for site &#x23;2, which has the highest MAE and RMSE. Comparison with the other models reveals that the proposed model tracks the wind speed more closely, especially when the wind speed fluctuates strongly, as seen in the predictions from period 680 onward.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Prediction errors of the multiple sites under MRGS-LSTM. The color of each ball indicates the MAE or RMSE value of the wind speed prediction. <bold>(A)</bold> MAE of the sites. <bold>(B)</bold> RMSE of the sites.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g009.tif"/>
</fig>
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption>
<p>Prediction results of each prediction model. <bold>(A)</bold> Site &#x23;11. <bold>(B)</bold> Site &#x23;2.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g010.tif"/>
</fig>
</sec>
<sec id="s5-4-4">
<title>5.4.4 Model training analysis</title>
<p>The training times of the different models are compared in <xref ref-type="table" rid="T3">Table 3</xref>. Our MRGS-LSTM model required the shortest training time. <xref ref-type="fig" rid="F11">Figure 11A</xref> shows the convergence curves of the prediction models on the training set. As the number of training epochs increases, the curves gradually decrease, which indicates that the models keep fitting the training data and reducing their errors. <xref ref-type="fig" rid="F11">Figure 11B</xref> shows the convergence curves of the prediction models on the validation set, which reflect their performance on unseen data. In both panels, the MRGS-LSTM model attains lower errors than the other models on the training and validation sets, indicating good generalization ability without overfitting.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Comparison of training times for different models.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Model</th>
<th align="center">Total time (s)</th>
<th align="center">Average time per epoch (s)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">GCN-MLP</td>
<td align="center">6503.578</td>
<td align="center">32.518</td>
</tr>
<tr>
<td align="center">GAT-MLP</td>
<td align="center">21,462.422</td>
<td align="center">107.312</td>
</tr>
<tr>
<td align="center">GS-MLP</td>
<td align="center">5143.611</td>
<td align="center">25.718</td>
</tr>
<tr>
<td align="center">GCN-LSTM</td>
<td align="center">6061.439</td>
<td align="center">30.307</td>
</tr>
<tr>
<td align="center">GAT-LSTM</td>
<td align="center">21,637.675</td>
<td align="center">108.188</td>
</tr>
<tr>
<td align="center">MRGS-LSTM</td>
<td align="center">4649.696</td>
<td align="center">23.248</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F11" position="float">
<label>FIGURE 11</label>
<caption>
<p>Convergence curves of the prediction models. <bold>(A)</bold> on the training set. <bold>(B)</bold> on the validation set.</p>
</caption>
<graphic xlink:href="fenrg-12-1427587-g011.tif"/>
</fig>
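<p>The two columns of Table 3 can be cross-checked with a short calculation. The sketch below divides the total training time by the average time per epoch for each model and also expresses each total relative to MRGS-LSTM. The values are copied from the table; the variable names are hypothetical, and the epoch count is only what the ratio suggests, not a setting confirmed in this section.</p>
<preformat><![CDATA[
```python
# (total time in s, average time per epoch in s), copied from Table 3.
training_time = {
    "GCN-MLP": (6503.578, 32.518),
    "GAT-MLP": (21462.422, 107.312),
    "GS-MLP": (5143.611, 25.718),
    "GCN-LSTM": (6061.439, 30.307),
    "GAT-LSTM": (21637.675, 108.188),
    "MRGS-LSTM": (4649.696, 23.248),
}

# Number of epochs implied by total / per-epoch time.
implied_epochs = {
    model: total / per_epoch for model, (total, per_epoch) in training_time.items()
}

# Total training time of each model relative to MRGS-LSTM.
reference_total = training_time["MRGS-LSTM"][0]
relative_total = {
    model: total / reference_total for model, (total, _) in training_time.items()
}

for model in training_time:
    print(model, round(implied_epochs[model], 2), round(relative_total[model], 2))
```
]]></preformat>
<p>For every model the ratio is close to 200, which suggests that all models were trained for the same number of epochs, and the two GAT-based models take more than four times as long as MRGS-LSTM.</p>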
</sec>
</sec>
</sec>
<sec sec-type="conclusion" id="s6">
<title>6 Conclusion</title>
<p>This study proposes a novel multi-site wind speed prediction model, namely MRGS-LSTM, which leverages historical wind speed features from multiple sites within the Roscoe Wind Farm in Texas, United States. By incorporating both spatial and temporal correlations, the proposed model enables accurate wind speed predictions at various locations.</p>
<p>Our approach captures the topology of the site distribution within the wind farm and extracts spatio-temporal correlations from different features to enhance prediction accuracy. First, we employ the mRMR-RF algorithm to obtain fused features that are highly relevant and minimally redundant. Second, based on the spatial distance correlation and the time series correlation between sites, we construct a weighted adjacency matrix that serves as input to the GraphSAGE network, which updates and aggregates the wind speed features at each site to produce spatial feature vectors. Next, these spatial feature vectors are resampled with a sliding time window, and LSTM memory cells capture historical information from the long time series to produce integrated spatio-temporal predictions. Finally, simulation experiments conducted with real historical data validate the effectiveness and accuracy of the proposed MRGS-LSTM model.</p>
<p>Although our model demonstrates promising results in wind speed prediction, further exploration is required regarding more precise integration of various spatial features for accurately predicting field-specific wind speeds. In future research endeavors, we plan to incorporate atmospheric factors such as air pressure and temperature as exogenous variables into the learning process to further enhance predictive capabilities.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s7">
<title>Data availability statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: <ext-link ext-link-type="uri" xlink:href="https://www.nrel.gov/grid/wind-toolkit.html">https://www.nrel.gov/grid/wind-toolkit.html</ext-link>, accessed on 20 February 2024.</p>
</sec>
<sec id="s8">
<title>Author contributions</title>
<p>YZ: Writing&#x2013;review and editing, Writing&#x2013;original draft, Visualization, Methodology, Investigation, Data curation, Conceptualization. XF: Writing&#x2013;original draft, Methodology, Funding acquisition, Formal Analysis, Conceptualization.</p>
</sec>
<sec sec-type="funding-information" id="s9">
<title>Funding</title>
<p>The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Key Research and Development Program of Hubei Province, China (Grant number 2021BAA193).</p>
</sec>
<sec sec-type="COI-statement" id="s10">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bai</surname>
<given-names>W. W.</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>M. X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W. W.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2024</year>). <article-title>Multi-step prediction of wind power based on hybrid model with improved variational mode decomposition and sequence-to-sequence network</article-title>. <source>Processes</source> <volume>12</volume> (<issue>1</issue>), <fpage>191</fpage>. <pub-id pub-id-type="doi">10.3390/pr12010191</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cadenas</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Rivera</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Campos-Amezcua</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Heard</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Wind speed prediction using a univariate ARIMA model and a multivariate NARX model</article-title>. <source>Energies</source> <volume>9</volume> (<issue>2</issue>), <fpage>109</fpage>. <pub-id pub-id-type="doi">10.3390/en9020109</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>W. J.</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>W. W.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Ultra-short-term wind power prediction based on bidirectional gated recurrent unit and transfer learning</article-title>. <source>Front. Energy Res.</source> <volume>9</volume>. <pub-id pub-id-type="doi">10.3389/fenrg.2021.808116</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>Z. M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A new algorithm to optimize maximal information coefficient</article-title>. <source>PLoS One</source> <volume>11</volume> (<issue>6</issue>), <fpage>e0157567</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0157567</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Geng</surname>
<given-names>X. L.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>L. Y.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>X. Y.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Graph optimization neural network with spatio-temporal correlation learning for multi-node offshore wind speed forecasting</article-title>. <source>Renew. Energy</source> <volume>180</volume>, <fpage>1014</fpage>&#x2013;<lpage>1025</lpage>. <pub-id pub-id-type="doi">10.1016/j.renene.2021.08.066</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hamilton</surname>
<given-names>W. L.</given-names>
</name>
<name>
<surname>Ying</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Leskovec</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Inductive representation learning on large graphs</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1706.02216</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>H. Y.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>F. Y.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>D. S.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Multiplex parallel GAT-ALSTM: a novel spatial-temporal learning model for multi-sites wind power collaborative forecasting</article-title>. <source>Front. Energy Res.</source> <volume>10</volume>. <pub-id pub-id-type="doi">10.3389/fenrg.2022.974682</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hutchinson</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2023</year>). <source>Global wind report 2023</source>. <publisher-loc>Rue de Commerce 31, 1000 Brussels, Belgium</publisher-loc>: <publisher-name>GWEC Europe Office</publisher-name>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://gwec.net/wp-content/uploads/2023/04/GWEC-2023_interactive.pdf">https://gwec.net/wp-content/uploads/2023/04/GWEC-2023_interactive.pdf</ext-link> (Accessed February 10, 2024)</comment>.</citation>
</ref>
<ref id="B9">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jager</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Andreas</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>1996</year>). <source>NREL national wind technology center (NWTC): M2 tower; boulder, Colorado (data)</source>. <publisher-loc>Golden, CO, USA</publisher-loc>: <publisher-name>National Renewable Energy Lab</publisher-name>.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khodayar</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J. H.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Spatio-temporal graph deep neural network for short-term wind speed forecasting</article-title>. <source>IEEE Trans. Sustain. Energy</source> <volume>10</volume> (<issue>2</issue>), <fpage>670</fpage>&#x2013;<lpage>681</lpage>. <pub-id pub-id-type="doi">10.1109/tste.2018.2844102</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khosravi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Koury</surname>
<given-names>R. N. N.</given-names>
</name>
<name>
<surname>Machado</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Pabon</surname>
<given-names>J. J. G.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Prediction of wind speed and wind direction using artificial neural network, support vector regression and adaptive neuro-fuzzy inference system</article-title>. <source>Sustain. Energy Technol. Assessments</source> <volume>25</volume>, <fpage>146</fpage>&#x2013;<lpage>160</lpage>. <pub-id pub-id-type="doi">10.1016/j.seta.2018.01.001</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname>
<given-names>D. P.</given-names>
</name>
<name>
<surname>Ba</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Adam: a method for stochastic optimization</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1412.6980</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>C. S.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>X. M.</given-names>
</name>
<name>
<surname>Saeed</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Short-term wind speed interval prediction based on ensemble GRU model</article-title>. <source>IEEE Trans. Sustain. Energy</source> <volume>11</volume> (<issue>3</issue>), <fpage>1370</fpage>&#x2013;<lpage>1380</lpage>. <pub-id pub-id-type="doi">10.1109/tste.2019.2926147</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Short-term wind power prediction via spatial temporal analysis and deep residual networks</article-title>. <source>Front. Energy Res.</source> <volume>10</volume>. <pub-id pub-id-type="doi">10.3389/fenrg.2022.920407</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>Z. H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X. F.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y. R.</given-names>
</name>
<name>
<surname>Jia</surname>
<given-names>Y. Y.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>A novel offshore wind farm typhoon wind speed prediction model based on PSO&#x2013;Bi-LSTM improved by VMD</article-title>. <source>Energy</source> <volume>251</volume>, <fpage>123848</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2022.123848</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2023a</year>). <article-title>Enhancing short-term wind speed forecasting using graph attention and frequency-enhanced mechanisms</article-title>. <source>arXiv</source>. <comment>arXiv:2305.11526</comment>. <pub-id pub-id-type="doi">10.48550/arXiv.2305.11526</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Mi</surname>
<given-names>X. W.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y. F.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Smart multi-step deep learning model for wind speed forecasting based on variational mode decomposition, singular spectrum analysis, LSTM network and ELM</article-title>. <source>Energy Convers. Manag.</source> <volume>159</volume>, <fpage>54</fpage>&#x2013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1016/j.enconman.2018.01.010</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>J. J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>X. L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>D. H.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z. L.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>F. J.</given-names>
</name>
</person-group> (<year>2023b</year>). <article-title>Adaptive graph-learning convolutional network for multi-node offshore wind speed forecasting</article-title>. <source>J. Mar. Sci. Eng.</source> <volume>11</volume> (<issue>4</issue>), <fpage>879</fpage>. <pub-id pub-id-type="doi">10.3390/jmse11040879</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z. Y.</given-names>
</name>
<name>
<surname>Ware</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Capturing spatial influence in wind prediction with a graph convolutional neural network</article-title>. <source>Front. Environ. Sci.</source> <volume>10</volume>. <pub-id pub-id-type="doi">10.3389/fenvs.2022.836050</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nielson</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bhaganagar</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Using field data&#x2013;based large eddy simulation to understand role of atmospheric stability on energy production of wind turbines</article-title>. <source>Wind Eng.</source> <volume>43</volume>, <fpage>625</fpage>&#x2013;<lpage>638</lpage>. <pub-id pub-id-type="doi">10.1177/0309524x18824540</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nielson</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bhaganagar</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Meka</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Alaeddini</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Using atmospheric inputs for Artificial Neural Networks to improve wind turbine power prediction</article-title>. <source>Energy</source> <volume>190</volume>, <fpage>116273</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2019.116273</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Zhen</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y. G.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <article-title>Dynamic spatio-temporal correlation and hierarchical directed graph structure based ultra-short-term wind farm cluster power forecasting method</article-title>. <source>Appl. Energy</source> <volume>323</volume>, <fpage>119579</fpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2022.119579</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y. Q.</given-names>
</name>
<name>
<surname>Gui</surname>
<given-names>R. Z.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>A hybrid model for GRU ultra-short-term wind speed prediction based on tsfresh and sparse PCA</article-title>. <source>Energies</source> <volume>15</volume> (<issue>20</issue>), <fpage>7567</fpage>. <pub-id pub-id-type="doi">10.3390/en15207567</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y. W.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y. Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y. L.</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Interval forecasting method of aggregate output for multiple wind farms using LSTM networks and time-varying regular vine copulas</article-title>. <source>Processes</source> <volume>11</volume> (<issue>5</issue>), <fpage>1530</fpage>. <pub-id pub-id-type="doi">10.3390/pr11051530</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>Q. Y.</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Lv</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Y. Z.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Ultra-short-term multi-step wind power forecasting based on CNN-LSTM</article-title>. <source>IET Renew. Power Gener.</source> <volume>15</volume> (<issue>5</issue>), <fpage>1019</fpage>&#x2013;<lpage>1029</lpage>. <pub-id pub-id-type="doi">10.1049/rpg2.12085</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting</article-title>. <source>Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)</source>, <fpage>3634</fpage>&#x2013;<lpage>3640</lpage>. <pub-id pub-id-type="doi">10.24963/ijcai.2018/505</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>E. B.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>G. J.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y. L.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>An efficient short-term wind speed prediction model based on cross-channel data integration and attention mechanisms</article-title>. <source>Energy</source> <volume>256</volume>, <fpage>124569</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2022.124569</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zeng</surname>
<given-names>W. L.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Quan</surname>
<given-names>Z. B.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>X. B.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A deep graph-embedded LSTM neural network approach for airport delay prediction</article-title>. <source>J. Adv. Transp.</source> <volume>2021</volume>, <fpage>1</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1155/2021/6638130</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y. G.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>G. F.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>J. Y.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>C. H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Short-term wind speed prediction model based on GA-ANN improved by VMD</article-title>. <source>Renew. Energy</source> <volume>156</volume>, <fpage>1373</fpage>&#x2013;<lpage>1388</lpage>. <pub-id pub-id-type="doi">10.1016/j.renene.2019.12.047</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Z. D.</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y. Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>X.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Wind speed prediction method using shared weight long short-term memory network and Gaussian process regression</article-title>. <source>Appl. Energy</source> <volume>247</volume>, <fpage>270</fpage>&#x2013;<lpage>284</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2019.04.047</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>X. X.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>R. Z.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>X. X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Z. X.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Wind speed behaviors feather analysis and its utilization on wind speed prediction using 3D-CNN</article-title>. <source>Energy</source> <volume>236</volume>, <fpage>121523</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2021.121523</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>