<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Earth Sci.</journal-id>
<journal-title>Frontiers in Earth Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Earth Sci.</abbrev-journal-title>
<issn pub-type="epub">2296-6463</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">696792</article-id>
<article-id pub-id-type="doi">10.3389/feart.2021.696792</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Earth Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>BS-LSTM: An Ensemble Recurrent Approach to Forecasting Soil Movements in the Real World</article-title>
<alt-title alt-title-type="left-running-head">Kumar et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">BS-LSTM for Forecasting Soil Movements</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Kumar</surname>
<given-names>Praveen</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1283945/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sihag</surname>
<given-names>Priyanka</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1438148/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chaturvedi</surname>
<given-names>Pratik</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/503901/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Uday</surname>
<given-names>K.V.</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1278873/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Dutt</surname>
<given-names>Varun</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/48650/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Applied Cognitive Science Lab, Indian Institute of Technology Mandi, <addr-line>Himachal Pradesh</addr-line>, <country>India</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Defence Terrain Research Laboratory, Defence Research and Development Organization (DRDO), <addr-line>New Delhi</addr-line>, <country>India</country>
</aff>
<aff id="aff3">
<label>
<sup>3</sup>
</label>Geohazard Studies Laboratory, Indian Institute of Technology Mandi, <addr-line>Himachal Pradesh</addr-line>, <country>India</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1050999/overview">Hong Haoyuan</ext-link>, University of Vienna, Austria</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1326809/overview">Sima Siami Namini</ext-link>, Mississippi State University, United&#x20;States</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1334780/overview">Ranjan Kumar Behera</ext-link>, Veer Surendra Sai University of Technology, India</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Varun Dutt, <email>varun@iitmandi.ac.in</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Environmental Informatics and Remote Sensing, a section of the journal Frontiers in Earth Science</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>23</day>
<month>08</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>9</volume>
<elocation-id>696792</elocation-id>
<history>
<date date-type="received">
<day>27</day>
<month>04</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>08</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Kumar, Sihag, Chaturvedi, Uday and Dutt.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Kumar, Sihag, Chaturvedi, Uday and Dutt</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Machine learning (ML) offers an extensive range of techniques that could be applied to forecasting soil movements using historical soil movements and other variables. For example, researchers have proposed recurrent ML techniques like the long short-term memory (LSTM) model for forecasting time series variables. However, the application of novel LSTM models for forecasting time series involving soil movements is yet to be fully explored. The primary objective of this research is to develop and test a new ensemble LSTM technique (called &#x201c;Bidirectional-Stacked-LSTM&#x201d; or &#x201c;BS-LSTM&#x201d;). In the BS-LSTM model, forecasts of soil movements are derived from a bidirectional LSTM for a period. These forecasts are then fed into a stacked LSTM to derive the next period&#x2019;s forecast. For developing the BS-LSTM model, datasets from two real-world landslide sites in India were used: Tangni (Chamoli district) and Kumarhatti (Solan district). In both datasets, the initial 80% of soil movements were used for model training and the last 20% for model testing. The BS-LSTM model&#x2019;s performance was compared to that of other LSTM variants, including a simple LSTM, a bidirectional LSTM, a stacked LSTM, a CNN-LSTM, and a Conv-LSTM, on both datasets. Results showed that the BS-LSTM model outperformed all other LSTM variants during training and testing on both the Tangni and Kumarhatti datasets. This research highlights the utility of developing recurrent ensemble models for forecasting soil movements ahead of&#x20;time.</p>
</abstract>
<kwd-group>
<kwd>soil movements</kwd>
<kwd>time-series forecasting</kwd>
<kwd>recurrent models</kwd>
<kwd>simple LSTMs</kwd>
<kwd>stacked LSTMs</kwd>
<kwd>bidirectional LSTMs</kwd>
<kwd>conv-LSTMs</kwd>
<kwd>CNN-LSTMs</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>Landslides are a persistent hazard in the Himalayan region. These disasters present a significant threat to life and property in many regions of India, especially in the mountain regions of Uttarakhand and Himachal Pradesh (<xref ref-type="bibr" rid="B34">Pande, 2006</xref>). Landslides cause enormous damage to property and life every year (<xref ref-type="bibr" rid="B40">Surya, 2011</xref>). To reduce landslide risks and the resulting losses, real-time modeling and forecasting of landslides and the associated soil movements is much needed (<xref ref-type="bibr" rid="B43">Van Westen et&#x20;al., 1997</xref>). This forecasting may help inform people about impending soil movements in landslide-prone areas (<xref ref-type="bibr" rid="B10">Chaturvedi et&#x20;al., 2017</xref>). One could forecast soil movements and generate warnings using historical soil movements and weather parameter values (<xref ref-type="bibr" rid="B21">Korup and Stolle, 2014</xref>). Real-time landslide monitoring stations provide ways by which soil movement and weather parameter data can be recorded in real time (<xref ref-type="bibr" rid="B35">Pathania et&#x20;al., 2020</xref>). Once these data are collected, one may develop machine learning (ML) models to forecast soil movements (<xref ref-type="bibr" rid="B23">Kumar et&#x20;al., 2019a</xref>; <xref ref-type="bibr" rid="B24">2019b</xref>, <xref ref-type="bibr" rid="B22">2020</xref>; <xref ref-type="bibr" rid="B25">2021a</xref>; <xref ref-type="bibr" rid="B26">2021b</xref>). Such ML models may take prior values as inputs and forecast the value of interest ahead of time (<xref ref-type="bibr" rid="B7">Behera et&#x20;al., 2018</xref>, <xref ref-type="bibr" rid="B5">2021a</xref>; <xref ref-type="bibr" rid="B27">Kumari et&#x20;al., 2020</xref>).</p>
<p>In the ML literature, recurrent neural network (RNN) models such as the long short-term memory (LSTM) model have been developed for forecasting soil movements (<xref ref-type="bibr" rid="B44">Xing et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B47">Yang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B18">Jiang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B29">Liu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B31">Meng et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B33">Niu et&#x20;al., 2021</xref>). These recurrent models possess internal memory and are a generalization of feedforward neural networks (<xref ref-type="bibr" rid="B30">Medsker and Jain, 1999</xref>). Such recurrent models perform the same function for each data input, and the output for the current input depends on prior computations (<xref ref-type="bibr" rid="B32">Mikolov et&#x20;al., 2011</xref>). Some researchers have forecasted soil movements by developing a single-layer LSTM model that used historical soil movements in a time series to forecast future movements (<xref ref-type="bibr" rid="B46">Xu and Niu, 2018</xref>; <xref ref-type="bibr" rid="B47">Yang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B18">Jiang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B29">Liu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B31">Meng et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B45">Xing et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B33">Niu et&#x20;al., 2021</xref>). For example, <xref ref-type="bibr" rid="B33">Niu et&#x20;al. (2021)</xref> combined ensemble empirical mode decomposition (EEMD) with an RNN model (EEMD-RNN) to forecast soil movements. Furthermore, <xref ref-type="bibr" rid="B18">Jiang et&#x20;al. (2020)</xref> developed an ensemble of LSTM and support vector regression (SVR) models to forecast soil movements. 
Similarly, a stacked LSTM model was developed by stacking LSTM layers to forecast soil movements (<xref ref-type="bibr" rid="B44">Xing et&#x20;al., 2019</xref>). Beyond these attempts at forecasting soil movements, there have been attempts at developing RNN models for time-series forecasting problems across different domains (<xref ref-type="bibr" rid="B15">Huang et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B7">Behera et&#x20;al., 2018</xref>, <xref ref-type="bibr" rid="B6">2021b</xref>; <xref ref-type="bibr" rid="B37">Qiu et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B49">Zhang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B2">Barzegar et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B11">Cui et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B39">Singh et&#x20;al., 2020</xref>). For example, convolutional LSTM (Conv-LSTM), bidirectional LSTM (Bi-LSTM), and CNN-LSTM models have been developed for natural language processing (NLP), crowd time-series forecasting, software reliability assessment, and water-quality variable forecasting (<xref ref-type="bibr" rid="B15">Huang et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B7">Behera et&#x20;al., 2018</xref>, <xref ref-type="bibr" rid="B6">2021b</xref>; <xref ref-type="bibr" rid="B37">Qiu et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B49">Zhang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B2">Barzegar et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B11">Cui et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B39">Singh et&#x20;al., 2020</xref>). However, a comprehensive evaluation of these RNN models has not yet been performed for soil movement forecasting. Furthermore, the development and evaluation of novel ensembles of RNN models for soil movement forecasting in hilly areas are yet to be explored.</p>
<p>The primary goal of this research is to bridge these literature gaps and develop new ensemble RNN techniques that have not been explored before. Specifically, in this research, we create a new RNN ensemble model called &#x201c;Bidirectional-Stacked-LSTM&#x201d; or &#x201c;BS-LSTM&#x201d; to forecast soil movements at known real-world landslide sites in India&#x2019;s Himalayan states. The new BS-LSTM model combines a stacked LSTM model and a bidirectional LSTM model to forecast soil movements. In the BS-LSTM model, first, forecasts of soil movements are derived from a bidirectional LSTM for a period. These forecasts are then fed into a stacked LSTM to derive the next time period&#x2019;s forecast. For the development and testing of the BS-LSTM model, we collected soil movement data from two real-world landslide sites in the Himalayan mountains in India: the Tangni site (Chamoli district) and the Kumarhatti site (Solan district). The Chamoli district has suffered several landslides in the recent past (<xref ref-type="bibr" rid="B20">Khanduri, 2018</xref>). A total of 220 landslides were recorded in this area in 2013, causing many deaths and massive damage to infrastructure (<xref ref-type="bibr" rid="B20">Khanduri, 2018</xref>). The Solan district in Himachal Pradesh has also been prone to landslides, and many landslide incidents have been recorded in this district (<xref ref-type="bibr" rid="B9">Chand, 2014</xref>; <xref ref-type="bibr" rid="B19">Kahlon et&#x20;al., 2014</xref>). The World Heritage Kalka&#x2013;Shimla railway line passes through the Kumarhatti site in the Solan district (<xref ref-type="bibr" rid="B16">ICOMOS, 2008</xref>). Debris flows at the Kumarhatti site have often damaged the Kalka&#x2013;Shimla railway line (<xref ref-type="bibr" rid="B40">Surya, 2011</xref>; <xref ref-type="bibr" rid="B9">Chand, 2014</xref>). 
In this research, using the soil movement data from the Tangni and Kumarhatti sites, we compare the performance of the BS-LSTM model to that of other LSTM variants, including a simple LSTM, a bidirectional LSTM, a stacked LSTM, a CNN-LSTM, and a Conv-LSTM. The primary novelty of this work is to propose the new BS-LSTM model and to compare its performance against existing state-of-the-art RNN models for soil movement forecasting. To the best of the authors&#x2019; knowledge, this work is the first of its kind to propose an ensemble of RNN models for soil movement forecasting.</p>
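The two-stage design described above can be sketched in Keras. This is a minimal illustration, not the authors' implementation: the look-back window, layer widths, and the way stage-1 forecasts are windowed for stage 2 are all assumptions made for the sketch.

```python
# Minimal sketch of a BS-LSTM-style ensemble (illustrative only): stage 1 is a
# bidirectional LSTM producing one-step forecasts; stage 2 is a stacked LSTM
# that consumes a window of stage-1 forecasts. All hyperparameters are assumed.
import numpy as np
from tensorflow.keras import layers, models

WINDOW = 4  # look-back window of past weekly soil movements (assumed)

# Stage 1: bidirectional LSTM, one forecast per input window.
bi_lstm = models.Sequential([
    layers.Input(shape=(WINDOW, 1)),
    layers.Bidirectional(layers.LSTM(32)),
    layers.Dense(1),
])

# Stage 2: stacked (two-layer) LSTM over a window of stage-1 forecasts.
stacked_lstm = models.Sequential([
    layers.Input(shape=(WINDOW, 1)),
    layers.LSTM(32, return_sequences=True),
    layers.LSTM(32),
    layers.Dense(1),
])

def bs_lstm_forecast(series):
    """Forecast the next value of `series` by chaining the two stages."""
    xs = np.array([series[i:i + WINDOW]
                   for i in range(len(series) - WINDOW + 1)], dtype="float32")
    stage1 = bi_lstm.predict(xs[..., None], verbose=0).ravel()
    # Feed the last WINDOW stage-1 forecasts to the stacked LSTM.
    return stacked_lstm.predict(stage1[-WINDOW:][None, :, None], verbose=0)[0, 0]
```

In practice, both stages would first be trained on the 80% training split (e.g., compiled with a mean-squared-error loss and fitted on sliding windows), with the stacked LSTM trained on the bidirectional LSTM's forecasts rather than used untrained as here.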
<p>First, we present a review of the literature on machine learning models for forecasting soil movements. The method for calibrating different models to forecast the soil movement data from the Tangni and Kumarhatti sites is described next in detail. Finally, we provide the results from various models and explore their implications for real-world soil movement forecasts.</p>
<sec id="s1-1">
<title>Background</title>
<p>Several research studies have proposed RNN models to forecast soil movements and determine various triggering parameters for such movements (<xref ref-type="bibr" rid="B46">Xu and Niu, 2018</xref>; <xref ref-type="bibr" rid="B44">Xing et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B47">Yang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B18">Jiang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B29">Liu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B31">Meng et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B33">Niu et&#x20;al., 2021</xref>). For example, <xref ref-type="bibr" rid="B47">Yang et&#x20;al. (2019)</xref> created an LSTM model for forecasting soil movements in China&#x2019;s Three Gorges Reservoir area. The model was trained using reservoir water level, rainfall, and soil movement data. In this experiment, a support vector machine (SVM) model was also developed for comparison with the LSTM model. The results showed that LSTMs could effectively forecast soil movements. Similarly, <xref ref-type="bibr" rid="B46">Xu and Niu (2018)</xref> developed LSTM models to forecast the Baijiabao landslide&#x2019;s displacement in China. The developed model was compared with an SVR model and a backpropagation neural network, and the LSTM model performed better than both. Furthermore, <xref ref-type="bibr" rid="B45">Xing et&#x20;al. (2020)</xref> created LSTM and SVR models to forecast soil movements of the Baishuihe landslide in China. The findings indicated that the LSTM model could be used to forecast soil movements. Next, <xref ref-type="bibr" rid="B29">Liu et&#x20;al. (2020)</xref> created LSTM, gated recurrent unit (GRU), and random forest (RF) models to forecast soil movements. These models were trained on data recorded from the Three Gorges Reservoir, China. Results showed that the GRU and LSTM models performed better than the RF model at forecasting soil movements. 
Furthermore, <xref ref-type="bibr" rid="B31">Meng et&#x20;al. (2020)</xref> created an LSTM model and trained it on data collected from the Baishuihe landslide in China. The recorded data from this landslide included different parameters like weather, rainfall, and soil movements. Univariate and multivariate versions of the LSTM model were created on this dataset. The results revealed that the multivariate LSTM model performed better without overfitting. Similarly, <xref ref-type="bibr" rid="B44">Xing et&#x20;al. (2019)</xref> developed a stacked LSTM model, in which the sequence of soil movements was split into different subsequences. Next, the model used these subsequences to forecast soil movements. <xref ref-type="bibr" rid="B33">Niu et&#x20;al. (2021)</xref> created an ensemble of EEMD and RNN models (the EEMD-RNN model). The proposed EEMD-RNN model was evaluated and compared to standard RNN, GRU, and simple LSTM models. The results showed that the EEMD-RNN model outperformed the individual RNN, GRU, and LSTM models. Next, <xref ref-type="bibr" rid="B18">Jiang et&#x20;al. (2020)</xref> developed an ensemble of simple LSTM and SVR models to forecast soil movements. The Shengjibao landslide in the Three Gorges Reservoir area in China was taken as a case study. The results showed that the ensemble model outperformed the individual simple LSTM and SVR models.</p>
<p>In addition to the machine-learning literature related to landslides, several ensembles of RNN models have been tried for general sequence forecasting and for landslide susceptibility forecasting problems (<xref ref-type="bibr" rid="B15">Huang et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B37">Qiu et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B49">Zhang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B2">Barzegar et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B11">Cui et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B39">Singh et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B42">Wang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B6">Behera et&#x20;al., 2021b</xref>). However, ensemble RNN approaches for forecasting soil movements have yet to be developed. For example, <xref ref-type="bibr" rid="B15">Huang et&#x20;al. (2015)</xref> created a Bi-LSTM model for sequence labeling in NLP. According to the study, the Bi-LSTM model could learn past and future input features in a sequence (<xref ref-type="bibr" rid="B15">Huang et&#x20;al., 2015</xref>). Next, <xref ref-type="bibr" rid="B37">Qiu et&#x20;al. (2018)</xref> built the DGeoSegmenter using a Bi-LSTM model, which derived words and combined them into phrases. Similarly, <xref ref-type="bibr" rid="B11">Cui et&#x20;al. (2020)</xref> created a hybrid model of a temporal Bi-LSTM with a semantic gate, namely SG-BiTLSTM. An LSTM model was also developed for comparison with the proposed model in identifying landslide hazard-affected bodies (i.e.,&#x20;roads and buildings) in images (<xref ref-type="bibr" rid="B11">Cui et&#x20;al., 2020</xref>). Results revealed that the SG-BiTLSTM model was better than the LSTM model at classifying the landslide-affected bodies and extracting image features. Furthermore, <xref ref-type="bibr" rid="B6">Behera et&#x20;al. (2021b)</xref> developed a Conv-LSTM to analyze consumer reviews posted on social media. 
An ensemble of CNNs and a simple LSTM model was created for the sentiment classification of reviews posted across diverse domains. The experimental results showed that the Conv-LSTM outperformed other machine learning approaches in accuracy and other metrics. Furthermore, <xref ref-type="bibr" rid="B39">Singh et&#x20;al. (2020)</xref> also created a Conv-LSTM model for crowd monitoring in large-scale public events. In this research, five different LSTM models were also developed for comparison with the Conv-LSTM model (<xref ref-type="bibr" rid="B39">Singh et&#x20;al., 2020</xref>). Results showed that the Conv-LSTM performed best among these models. Besides, <xref ref-type="bibr" rid="B2">Barzegar et&#x20;al. (2020)</xref> created an ensemble CNN-LSTM model to forecast water quality variables. In this experiment, the CNN-LSTM model was compared with other ML models, namely LSTM, CNN, SVR, and decision tree (DT) models. Results revealed that the developed ensemble model performed better than the non-ensemble models (CNN, LSTM, DT, and SVR). Similarly, <xref ref-type="bibr" rid="B49">Zhang et&#x20;al. (2019)</xref> also developed an ensemble CNN-LSTM model for the zonation of landslide hazards. The developed model was compared to other shallow ML models in this experiment, and the CNN-LSTM was found to be better than the other shallow ML models.</p>
<p>In addition, the development of RNN ensemble techniques has shown promise in social network analysis, NLP, and similarity-measure prediction (<xref ref-type="bibr" rid="B28">Lin et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B4">Behera et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B5">Behera et&#x20;al., 2021a</xref>; <xref ref-type="bibr" rid="B3">Behera et&#x20;al., 2019</xref>). For example, an ensemble of RNN models could be used to predict the critical nodes in extensive networks (<xref ref-type="bibr" rid="B4">Behera et&#x20;al., 2020</xref>). Furthermore, <xref ref-type="bibr" rid="B5">Behera et&#x20;al. (2021a)</xref> predicted missing links using similarity measures, which quantify the similarity between two links. Similarly, <xref ref-type="bibr" rid="B3">Behera et&#x20;al. (2019)</xref> used similarity measures for community detection in large-scale networks. An ensemble of RNN models could predict these similarity measures (<xref ref-type="bibr" rid="B28">Lin et&#x20;al., 2019</xref>).</p>
<p>We found that an ensemble of different RNN models has not been explored in the past for soil movement forecasting. Instead, ensembles of an RNN model with EEMD or SVR models (<xref ref-type="bibr" rid="B44">Xing et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B18">Jiang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B33">Niu et&#x20;al., 2021</xref>), or individual ML models like LSTM, GRU, SVR, SVM, DT, and RF (<xref ref-type="bibr" rid="B46">Xu and Niu, 2018</xref>; <xref ref-type="bibr" rid="B47">Yang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B29">Liu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B31">Meng et&#x20;al., 2020</xref>), have been developed for soil movement forecasting. Furthermore, these ensembles and individual RNN models have been compared with individual ML models (LSTM, GRU, SVR, SVM, DT, and RF) for forecasting soil movements. However, different variants of ensembles of RNN models have not been compared in the past. Moreover, some RNN models like the CNN-LSTM, Conv-LSTM, stacked LSTM, and Bi-LSTM have performed well in social network analysis and NLP problems. However, these models have not yet been developed for soil movement forecasting.</p>
<p>Overall, this paper&#x2019;s primary objective is to fill the literature gaps highlighted above by introducing a new RNN ensemble model called &#x201c;Bidirectional-Stacked-LSTM&#x201d; or &#x201c;BS-LSTM&#x201d; for forecasting soil movements at the Tangni and Kumarhatti sites in the Himalayan region of India. Furthermore, we develop CNN-LSTM, Conv-LSTM, Bi-LSTM, and stacked LSTM models and compare their performance with the BS-LSTM in forecasting soil movements. To the best of the authors&#x2019; knowledge, this type of soil movement forecasting study has never been carried out in the Chamoli and Solan districts in India. Thus, the main novelty of this work is to develop an ensemble of RNN models for soil movement forecasting at new sites in the Himalayan region.</p>
</sec>
<sec id="s1-2">
<title>Study Area</title>
<p>The data for training the RNN models were collected from the two landslide sites at Tangni and Kumarhatti (see <xref ref-type="fig" rid="F1">Figure&#x20;1A</xref>). The Tangni landslide is located in the Chamoli district, India. The landslide is at longitude 79&#xb0; 27&#x2032; 26.3&#x2033; E and latitude 30&#xb0; 27&#x2032; 54.3&#x2033; N, at an elevation of 1,450&#xa0;m (<xref ref-type="fig" rid="F1">Figures 1A,B</xref>). As depicted in <xref ref-type="fig" rid="F1">Figure&#x20;1B</xref>, the study area is located on National Highway 7, which connects Fazilka in Punjab with Mana Pass. This landslide is categorized as a rock-cum-debris slide (<xref ref-type="bibr" rid="B50">THDC, 2009</xref>). Furthermore, the Tangni site&#x2019;s slope inclines 30&#xb0; above the road level and 42&#xb0; below it. The area surrounding the landslide is a forest consisting of oak and pine trees. Landslides occurred frequently in this area in 2013, causing financial losses to the travel industry (<xref ref-type="bibr" rid="B17">IndiaNews, 2013</xref>). To monitor soil movements, inclinometer sensors were installed at the Tangni site between 2012 and&#x20;2014.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>
<bold>(A)</bold> Locations of the study areas. <bold>(B)</bold> Borehole location of the Tangni site on Google Maps. <bold>(C)</bold> The landslide monitoring sensors installed on a hill near the railway track at Kumarhatti. <bold>(D)</bold> The Kumarhatti site&#x2019;s location on Google Maps.</p>
</caption>
<graphic xlink:href="feart-09-696792-g001.tif"/>
</fig>
<p>The Kumarhatti site is located in the Solan district, India, along the Kalka&#x2013;Shimla railway track. The site is at longitude 77&#xb0; 02&#x2032; 50.0&#x2033; E and latitude 30&#xb0; 53&#x2032; 37.0&#x2033; N, at an elevation of 1,734&#xa0;m (<xref ref-type="fig" rid="F1">Figures 1C,D</xref>). Landslide debris has often damaged the Kalka&#x2013;Shimla railway line at the Kumarhatti site, as recorded by the railway department (<xref ref-type="bibr" rid="B40">Surya, 2011</xref>; <xref ref-type="bibr" rid="B9">Chand, 2014</xref>). A low-cost landslide monitoring system was set up in 2020 at the Kumarhatti site to detect soil movements (<xref ref-type="bibr" rid="B35">Pathania et&#x20;al., 2020</xref>; see <xref ref-type="fig" rid="F1">Figure&#x20;1C</xref>).</p>
<p>Soil movement data were collected daily between July 1st, 2012 and July 1st, 2014 from the sensors installed at the Tangni site (see <xref ref-type="fig" rid="F1">Figure&#x20;1B</xref>). Soil movement data (in meters) were recorded from the accelerometer sensors installed at the Kumarhatti site between September 10th, 2020 and June 17th, 2021. The accelerometer sensors were installed at a depth of 1&#xa0;m from the hill surface at the Kumarhatti&#x20;site.</p>
</sec>
</sec>
<sec sec-type="methods" id="s2">
<title>Methodology</title>
<sec id="s2-1">
<title>Data Preparation and Analysis</title>
<sec id="s2-1-1">
<title>Tangni Site</title>
<p>The soil movement time series of the Tangni site was collected from several inclinometer sensors. Twenty-five inclinometer sensors (i.e.,&#x20;five sensors at different depths in each of five boreholes) were placed at the Tangni&#x20;site.</p>
<p>In each borehole, the first sensor was installed at a depth of 3&#xa0;m; the second at 6&#xa0;m; the third at 9&#xa0;m; the fourth at 12&#xa0;m; and the fifth at 15&#xa0;m from the hill&#x2019;s surface. Each sensor measured the inclination change in millimeters per meter (i.e.,&#x20;tilt angle). <xref ref-type="fig" rid="F2">Figure&#x20;2A</xref> depicts the working principle of the inclinometer sensor at the site. As illustrated in <xref ref-type="fig" rid="F2">Figure&#x20;2A</xref>, if an inclinometer&#x2019;s length is <inline-formula id="inf1">
<mml:math id="m1">
<mml:mi>L</mml:mi>
</mml:math>
</inline-formula> and the incline changes by &#x3b8;, then the soil movement is <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>&#x22c5;</mml:mo>
<mml:mi>sin</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. As a result, we converted soil movement units into inclination <inline-formula id="inf3">
<mml:math id="m3">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>m</mml:mi>
<mml:mo>/</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> units, where one <inline-formula id="inf4">
<mml:math id="m4">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>m</mml:mi>
<mml:mo>/</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> unit equaled a 0.057&#xb0; tilt of the&#x20;soil.</p>
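The tilt-to-movement relations above can be checked with a short calculation. This is a worked example with illustrative values, not site data; the function names are hypothetical.

```python
# Worked example of the relations above: a tilt reading in mm/m corresponds to
# an angle of atan(tilt/1000), and an inclinometer of length L tilted by theta
# implies a soil movement of L*sin(theta). Values here are illustrative.
import math

def mm_per_m_to_degrees(tilt_mm_per_m):
    """Convert an inclinometer reading in mm/m to a tilt angle in degrees."""
    return math.degrees(math.atan(tilt_mm_per_m / 1000.0))

def soil_movement(length_m, tilt_deg):
    """Soil movement L*sin(theta) for an inclinometer of length L."""
    return length_m * math.sin(math.radians(tilt_deg))

print(round(mm_per_m_to_degrees(1.0), 3))  # → 0.057, matching the text
```

A 1 mm/m reading on a 1 m inclinometer thus corresponds to a displacement of about 1 mm, consistent with the definition of the unit.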
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>
<bold>(A)</bold> The inclinometer sensor was installed in the borehole at the landslide location. <bold>(B)</bold> The analysis of the maximum soil movement of inclinometer installed at 12&#xa0;m depth near the failure&#x20;plane.</p>
</caption>
<graphic xlink:href="feart-09-696792-g002.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F2">Figure&#x20;2A</xref>, the sensor has A and B axes, with a positive and a negative side on each. For example, on the A-axis, the <inline-formula id="inf5">
<mml:math id="m5">
<mml:mrow>
<mml:msup>
<mml:mi>A</mml:mi>
<mml:mo>&#x2b;</mml:mo>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> side measured the upward movement, and the <inline-formula id="inf6">
<mml:math id="m6">
<mml:mrow>
<mml:msup>
<mml:mi>A</mml:mi>
<mml:mo>&#x2212;</mml:mo>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> side measured the downward movement of the hill. The sensors were set up so that the positive A-axis recorded upward movements and the negative A-axis recorded downward movements towards the road&#x20;level.</p>
<p>First, we determined each sensor&#x2019;s relative tilt angle along the A-axis from its original value at installation time. Second, the sensors closest to the landslide&#x2019;s failure plane were expected to yield the most significant tilt: if a sensor is near the failure plane, the soil mass movement will be most&#x20;significant at that sensor (see <xref ref-type="fig" rid="F2">Figure&#x20;2B</xref>). We chose the sensors that revealed the maximum soil movement in each borehole over the 2&#xa0;years. As shown in <xref ref-type="fig" rid="F2">Figures 2A,B</xref>, the sensor in borehole two at a depth of 12&#xa0;m showed the maximum soil movement over the 2&#xa0;years.</p>
<p>This maximum soil movement was likely due to the sensor in borehole two being installed near the failure plane at a depth of 12&#xa0;m. Overall, we took the sensor with the maximum average change in inclination from each borehole. As a result, the dataset was reduced to five soil movement time series (one per borehole, each captured by a single sensor).</p>
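The sensor-selection step described above can be sketched with pandas. The column names and toy values below are placeholders for illustration, not the authors' actual data schema.

```python
import pandas as pd

# Hypothetical sketch: for each borehole, keep the sensor whose average
# absolute change in inclination over the series is largest.
readings = pd.DataFrame({
    "borehole": [1, 1, 2, 2],
    "depth_m":  [3, 6, 9, 12],
    "tilt_deg": [[0.1, 0.2, 0.1], [0.0, 0.1, 0.0],
                 [0.2, 0.1, 0.3], [0.5, 1.2, 1.8]],
})
# Mean absolute step-to-step change for each sensor's tilt series.
readings["mean_abs_change"] = readings["tilt_deg"].apply(
    lambda s: pd.Series(s).diff().abs().mean()
)
# Pick the sensor with the maximum average change in each borehole.
selected = readings.loc[
    readings.groupby("borehole")["mean_abs_change"].idxmax()
]
# With these toy values: borehole 1 -> 3 m sensor, borehole 2 -> 12 m sensor.
```

Applied to all five boreholes, this reduces the dataset to one time series per borehole, as described above.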
<p>Furthermore, soil movements were negligible at the daily scale. We therefore summed the daily movement data over 7-day windows to produce 78&#xa0;weeks of aggregated soil movement data. The weekly time series of the five sensors (one per borehole) were used to develop the different RNN models.</p>
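The daily-to-weekly aggregation can be sketched as follows; the synthetic daily series and start date are assumptions standing in for a borehole's raw data.

```python
import numpy as np
import pandas as pd

# Sketch of the weekly aggregation: 78 * 7 = 546 daily tilt-change values
# summed over 7-day windows to give 78 weekly movement values.
rng = np.random.default_rng(0)
daily = pd.Series(
    rng.normal(0.0, 0.05, size=78 * 7),
    index=pd.date_range("2016-01-04", periods=78 * 7, freq="D"),
)
weekly = daily.resample("7D").sum()  # sum movement over each 7-day window
# len(weekly) == 78, and the total movement is preserved by the summation.
```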
<p>
<xref ref-type="fig" rid="F3">Figure&#x20;3A&#x2013;E</xref> plots the soil movements (in&#x00B0;) over 78&#xa0;weeks in each of the five boreholes at Tangni. As illustrated in <xref ref-type="fig" rid="F3">Figures 3A&#x2013;E</xref>, upward and downward soil movements along the hill were represented by positive and negative tilt angles, respectively. For example, in <xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>, in week 30, the movement was 1.71&#xb0; which changed to &#x2212;5.53&#xb0; in week 32. Therefore, there was a rapid downward soil movement of &#x2212;7.24&#xb0; in week 32. The soil movement data from borehole one to borehole five showed a consistent downward soil movement behavior (see <xref ref-type="fig" rid="F3">Figures 3A&#x2013;E</xref>). The borehole one continuously showed a downside soil movement from week one to week 31; but, it showed a considerable movement in week 32. Borehole five, which was installed near the crest, also showed movements between weeks 1 and 78; but, it showed significant movements in weeks 23 and weeks 60&#x2013;78. Boreholes two, three, and four, located between the crest and toe, detected small soil movement in the beginning and last&#x20;weeks.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>The plots of soil movement recorded from sensors deployed at different sites. <bold>(A)</bold> Sensor tilt at 3&#xa0;m in borehole one&#xa0;at Tangni. <bold>(B)</bold> Sensor tilt at 12&#xa0;m in borehole two at Tangni. <bold>(C)</bold> Sensor tilt at 6&#xa0;m in borehole three at Tangni. <bold>(D)</bold> Sensor tilt at 15&#xa0;m in borehole four at Tangni. <bold>(E)</bold> Sensor tilt at 15&#xa0;m in borehole five at Tangni. <bold>(F)</bold> Soil movements (in meters) per day at Kumarhatti.</p>
</caption>
<graphic xlink:href="feart-09-696792-g003.tif"/>
</fig>
</sec>
<sec id="s2-1-2">
<title>Kumarhatti Site</title>
<p>The landslide monitoring station at the Kumarhatti site has an accelerometer sensor installed at a depth of 1&#xa0;m below the soil surface. The sensor has three orthogonal axes, X, Y, and Z, with positive and negative directions on each axis. For example, the X-axis has a positive (X&#x2b;) side measuring downhill movement and a negative (X&#x2212;) side measuring uphill movement. The sensor was installed with the positive X-axis (X&#x2b;) parallel to the hill&#x2019;s slope, recording both positive and negative movements. Every 10&#xa0;min, the accelerometer recorded the acceleration due to gravity at the deployment site. This acceleration was later converted into soil movements (in meters) by double integration using the trapezoidal rule. The Kumarhatti dataset has 36,000 soil movement points, one every 10&#xa0;min, over 250&#xa0;days.</p>
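The double-integration step can be sketched with a cumulative trapezoidal rule in NumPy. The sampling interval and the constant toy acceleration are assumptions; a real pipeline would also remove the gravity component and sensor bias before integrating.

```python
import numpy as np

# Minimal sketch: acceleration -> velocity -> displacement by applying the
# cumulative trapezoidal rule twice.
def cumtrapz(y, dt):
    """Cumulative trapezoidal integral of y sampled every dt seconds."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum((y[1:] + y[:-1]) * dt / 2.0)
    return out

dt = 600.0                      # 10-minute sampling interval, in seconds
t = np.arange(0, 6) * dt
acc = np.full_like(t, 1e-8)     # constant tiny acceleration (m/s^2), toy input
vel = cumtrapz(acc, dt)         # first integration: velocity (m/s)
disp = cumtrapz(vel, dt)        # second integration: displacement (m)
# For constant acceleration a, disp[-1] matches 0.5 * a * T^2 exactly.
```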
<p>
<xref ref-type="fig" rid="F3">Figure&#x20;3F</xref> depicts the daily soil movement (in meters) over 250&#xa0;days for the Kumarhatti dataset. As illustrated in <xref ref-type="fig" rid="F3">Figure&#x20;3F</xref>, the positive slope of the soil movement in the graph represents the slope moving toward the railway track at the Kumarhatti&#x20;site.</p>
<p>The soil movement data from the Tangni and Kumarhatti sites were split in an 80:20 ratio to train and test the different models. For both sites, the developed LSTM models were first trained on the initial 80% of the data and then tested on the remaining 20%. The Tangni dataset has 62 data points for training and 16 for testing. The Kumarhatti dataset has 28,800 data points for training and 7,200 data points for testing.</p>
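Because the data are time series, the split is chronological rather than shuffled, so the test set is strictly later in time than the training set. A minimal sketch:

```python
# Chronological 80/20 split: no shuffling, test data follows training data.
def train_test_split_series(series, train_frac=0.8):
    n_train = int(len(series) * train_frac)
    return series[:n_train], series[n_train:]

tangni = list(range(78))                 # 78 weekly values (placeholder data)
train, test = train_test_split_series(tangni)
# Applied to the 78-week Tangni series this yields the 62/16 split above.
```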
</sec>
</sec>
<sec id="s2-2">
<title>Dataset Attributes</title>
<p>The Tangni and Kumarhatti datasets included four distinct attributes: the timestamp, the borehole number, the sensor depth, and the soil movement. For example, in the Tangni dataset, [2, 2, 12, 1.8&#x00B0;] denotes that the sensor in borehole two at a depth of 12&#xa0;m recorded a soil movement of 1.8&#x00B0; in the upward direction during the second week since deployment. In the Kumarhatti dataset, [20, 1, 1, 0.001&#xa0;m] denotes that the accelerometer at a 1-m depth in borehole one showed a downward movement of 0.001&#xa0;m in the 200th minute (the 20th 10-min interval) since deployment.</p>
</sec>
<sec id="s2-3">
<title>Means, Standard Deviations, and Correlations in the Dataset</title>
<p>
<xref ref-type="table" rid="T1">Table&#x20;1</xref> displays the means and standard deviations (SD) of the Tangni and Kumarhatti datasets. For example, soil movements in borehole one had an SD of 1.83&#xb0;. Furthermore, in <xref ref-type="table" rid="T1">Table&#x20;1</xref>, columns three through seven show the correlation (<italic>r</italic>) of soil movements between different boreholes at Tangni. If two boreholes lie on the same failure plane of the landslide, both will show soil movements simultaneously, and their series will be highly correlated. For example, borehole two was highly correlated with borehole four. As shown in <xref ref-type="fig" rid="F1">Figure&#x20;1B</xref>, both boreholes are nearby, so it is possible that both lie on the same failure plane. The soil movement for the Kumarhatti data had a mean of 0.02&#xa0;m and a standard deviation of 0.02&#xa0;m. In <xref ref-type="table" rid="T1">Table&#x20;1</xref>, borehole is abbreviated as&#x20;BH.</p>
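The statistics in Table 1 can be reproduced with pandas; the toy values and column names below are placeholders, with each column standing in for one borehole's weekly tilt series.

```python
import pandas as pd

# Sketch of computing per-borehole means, standard deviations, and the
# Pearson correlation matrix (r) between boreholes, as in Table 1.
df = pd.DataFrame({
    "BH1-03m": [-4.0, -5.5, -3.5, -4.1, -3.6],
    "BH2-12m": [-2.4, -2.6, -2.5, -2.3, -2.45],
})
means = df.mean()   # per-borehole mean tilt (degrees)
sds = df.std()      # per-borehole standard deviation
corr = df.corr()    # Pearson correlation between boreholes; diagonal is 1
```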
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Means, standard deviations, and the correlations between the time series of different boreholes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<td align="left"/>
<td align="left"/>
<td align="center">Mean</td>
<td align="center">SD</td>
<td align="center">BH1-03&#xa0;m</td>
<td align="center">BH2-12&#xa0;m</td>
<td align="center">BH3-06&#xa0;m</td>
<td align="center">BH4-15&#xa0;m</td>
<td align="center">BH5-15&#xa0;m</td>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="5" align="left">Tangni Data</td>
<td align="center">BH1-03&#xa0;m</td>
<td align="char" char=".">&#x2212;4.14<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">1.83<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">1</td>
<td align="left"/>
<td align="left"/>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">BH2-12&#xa0;m</td>
<td align="char" char=".">&#x2212;2.45<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.10<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.162</td>
<td align="char" char=".">1</td>
<td align="left"/>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">BH3-06&#xa0;m</td>
<td align="char" char=".">&#x2212;0.16<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.69<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.174</td>
<td align="char" char=".">0.026</td>
<td align="char" char=".">1</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">BH4-15&#xa0;m</td>
<td align="char" char=".">&#x2212;3.14<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.64<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.266</td>
<td align="char" char=".">0.552</td>
<td align="char" char=".">0.034</td>
<td align="char" char=".">1</td>
<td align="left"/>
</tr>
<tr>
<td align="center">BH5-15&#xa0;m</td>
<td align="char" char=".">&#x2212;1.74<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">1.50<bold>&#x00B0;</bold>
</td>
<td align="char" char=".">0.000</td>
<td align="char" char=".">0.142</td>
<td align="char" char=".">0.101</td>
<td align="char" char=".">0.223</td>
<td align="char" char=".">1</td>
</tr>
<tr>
<td align="left">Kumarhatti Data</td>
<td align="center">BH1-01&#xa0;m</td>
<td align="char" char=".">0.02&#xa0;m</td>
<td align="char" char=".">0.0&#xa0;m</td>
<td align="left"/>
<td align="left"/>
<td align="left"/>
<td align="left"/>
<td align="left"/>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s2-4">
<title>Autocorrelation Plots</title>
<p>The current values within time series data may correlate with their previous values, which we call lags or look-back periods. The autocorrelation function (ACF) determines the number of significant lags present in a time series. The partial autocorrelation function (PACF) determines the direct correlation between the current value and each lag value after removing the correlation of all intermediate lags. As a result, the ACF value determines how many past forecast errors are required to forecast the current value, and the PACF value determines how many past values are required to forecast the current value. <xref ref-type="fig" rid="F4">Figures&#x20;4A&#x2013;L</xref> show the ACF and PACF plots with a 95% confidence interval for the Tangni and Kumarhatti datasets. From these figures, for the Tangni dataset, time series one had an ACF value of five and a PACF value of one; time series two had ACF and PACF values of zero; time series three and four each had ACF and PACF values of two; and time series five had an ACF value of four and a PACF value of one. Similarly, the time series from the Kumarhatti dataset had an ACF value of 42 and a PACF value of one. Based upon the range of ACF and PACF values across the two datasets, the look-back period for the developed models was varied from one to five for the Tangni data and from one to 42 for the Kumarhatti&#x20;data.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>The autocorrelation and partial autocorrelation plots with the 95% confidence interval. <bold>(A)</bold> Autocorrelation for sensor at 3&#xa0;m in borehole one at Tangni. <bold>(B)</bold> Autocorrelation for sensor at 12&#xa0;m in borehole two at Tangni. <bold>(C)</bold> Autocorrelation for the sensor at 6&#xa0;m in borehole three at Tangni. <bold>(D)</bold> Autocorrelation for the sensor at 15&#xa0;m in borehole four at Tangni. <bold>(E)</bold> Autocorrelation for the sensor at 15&#xa0;m in borehole five at Tangni. <bold>(F)</bold> Autocorrelations for the Kumarhatti data. <bold>(G)</bold> Partial autocorrelation for the sensor at 3&#xa0;m in borehole one&#xa0;at Tangni. <bold>(H)</bold> Partial autocorrelation for the sensor at 12&#xa0;m in borehole two at Tangni. <bold>(I)</bold> Partial autocorrelation for the sensor at 6&#xa0;m in borehole three at Tangni. <bold>(J)</bold> Partial autocorrelation for the sensor at 15&#xa0;m in borehole four at Tangni. <bold>(K)</bold> Partial autocorrelation for the sensor at 15&#xa0;m in borehole five at Tangni. <bold>(L)</bold> Partial autocorrelations for the Kumarhatti&#x20;data.</p>
</caption>
<graphic xlink:href="feart-09-696792-g004.tif"/>
</fig>
</sec>
<sec id="s2-5">
<title>Recurrent Neural Network Models</title>
<p>Recurrent neural networks (RNNs) are specially designed to discover dependencies between current and previous values in time series data (<xref ref-type="bibr" rid="B30">Medsker and Jain, 1999</xref>). RNNs are composed of a chain of cells linked by a feedback loop, and these cells extract temporal information from the time series. Every cell in a basic RNN has a simple design, such as a single tanh function (<xref ref-type="bibr" rid="B30">Medsker and Jain, 1999</xref>). However, RNN models suffer from exploding and vanishing gradients during training (<xref ref-type="bibr" rid="B8">Bengio et&#x20;al., 1994</xref>). The problem arises when a long sequence of small or large values is multiplied while calculating the gradient in backpropagation. Exploding and vanishing gradients prevent the model from learning long-term dependencies in the&#x20;data.</p>
<p>LSTMs solve the exploding and vanishing gradient problems by employing an additive gradient structure in backpropagation. This structure includes direct access to the forget gate activations, allowing the network to update its parameters so that the gradient neither explodes nor vanishes (<xref ref-type="bibr" rid="B14">Hochreiter and Schmidhuber, 1997</xref>). Thus, LSTMs enable the learning of longer-term dependencies that plain RNN models cannot capture.</p>
</sec>
<sec id="s2-6">
<title>Simple Long Short-Term Memory Model</title>
<p>The simple LSTM is a type of RNN model that can remember values from previous stages (<xref ref-type="bibr" rid="B30">Medsker and Jain, 1999</xref>). The cell state in an LSTM acts as a conveyor belt <inline-formula id="inf7">
<mml:math id="m7">
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>, allowing unaltered information to flow through the units with only a few linear interactions. The internal architecture of each LSTM unit has an input gate <inline-formula id="inf8">
<mml:math id="m8">
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>, forget gate <inline-formula id="inf9">
<mml:math id="m9">
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>, and output gate <inline-formula id="inf10">
<mml:math id="m10">
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. These three gates control the flow of information and avoid the exploding or vanishing gradient problems during training (<xref ref-type="bibr" rid="B14">Hochreiter and Schmidhuber, 1997</xref>) (see <xref ref-type="fig" rid="F5">Figure&#x20;5A</xref>). The input gate adds new information from the current input and the previous output to the cell state. The forget gate determines which information is retained in the cell state and which is removed. It uses the sigmoid (logistic) function, whose output lies between zero and one: an output of zero removes the information from the cell state, and an output of one keeps it. The output gate determines what output value is required from the cell state and also updates the previous hidden state <inline-formula id="inf11">
<mml:math id="m11">
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. There is also a layer in the LSTM unit containing the tanh activation function, which is used to update the state of the neurons (see <xref ref-type="fig" rid="F5">Figure&#x20;5A</xref>). <xref ref-type="disp-formula" rid="e1">Eqs 1</xref>&#x2013;<xref ref-type="disp-formula" rid="e5">5</xref> are the fundamental equations of the LSTM cell, with the Hadamard product denoted by the symbol &#x2032;<inline-formula id="inf12">
<mml:math id="m12">
<mml:mi>&#x3bf;</mml:mi>
</mml:math>
</inline-formula>&#x2032;:<disp-formula id="e1">
<mml:math id="m13">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
<disp-formula id="e2">
<mml:math id="m14">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>f</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
<disp-formula id="e3">
<mml:math id="m15">
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>c</mml:mi>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
<disp-formula id="e4">
<mml:math id="m16">
<mml:mrow>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mo>&#xa0;</mml:mo>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>o</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
<disp-formula id="e5">
<mml:math id="m17">
<mml:mrow>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>where <inline-formula id="inf13">
<mml:math id="m18">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the input gate at timestamp <italic>t</italic>; <inline-formula id="inf14">
<mml:math id="m19">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the forget&#x20;gate at timestamp <italic>t</italic>; <inline-formula id="inf15">
<mml:math id="m20">
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the cell state at timestamp <italic>t</italic>;&#x20;<inline-formula id="inf16">
<mml:math id="m21">
<mml:mrow>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the output gate at timestamp <italic>t</italic>; and <inline-formula id="inf17">
<mml:math id="m22">
<mml:mrow>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the hidden state at&#x20;timestamp <italic>t</italic>. The variable <italic>x</italic>
<sub>
<italic>t</italic>
</sub> in the equations represents the input data sequence at timestamp <italic>t</italic>. The&#x20;matrices <inline-formula id="inf18">
<mml:math id="m23">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf19">
<mml:math id="m24">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf20">
<mml:math id="m25">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the weight matrices between two different layers. For example, the <inline-formula id="inf21">
<mml:math id="m26">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the weight matrix between input data&#x20;sequence <italic>x</italic> and input gate <inline-formula id="inf22">
<mml:math id="m27">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>. Similarly, the <inline-formula id="inf23">
<mml:math id="m28">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf24">
<mml:math id="m29">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>f</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>o</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>and <inline-formula id="inf25">
<mml:math id="m30">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the biases for the input gate, forget gate, output gate, and cell state, respectively. The <inline-formula id="inf26">
<mml:math id="m31">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf27">
<mml:math id="m32">
<mml:mi>&#x3c3;</mml:mi>
</mml:math>
</inline-formula> here represent the hyperbolic tangent and sigmoid (logistic) activation functions.</p>
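The cell equations above can be sketched as a single forward step in NumPy. The dimensions, random weights, and vector-valued (diagonal) peephole terms are illustrative assumptions, not the authors' implementation; the forget gate scales the previous cell state, as in the standard formulation, and the Hadamard product is elementwise `*`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One forward step of the peephole LSTM cell (Eqs 1-5).
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + W["ci"] * c_prev + b["i"])    # Eq 1
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + W["cf"] * c_prev + b["f"])    # Eq 2
    c_t = f_t * c_prev + i_t * np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])  # Eq 3
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + W["co"] * c_prev + b["o"])    # Eq 4
    h_t = o_t * np.tanh(c_t)                                                       # Eq 5
    return h_t, c_t

rng = np.random.default_rng(0)
n_x, n_h = 1, 4  # one input feature (e.g., weekly tilt), four hidden units
W = {k: rng.normal(size=(n_h, n_x)) for k in ("xi", "xf", "xc", "xo")}
W.update({k: rng.normal(size=(n_h, n_h)) for k in ("hi", "hf", "hc", "ho")})
W.update({k: rng.normal(size=n_h) for k in ("ci", "cf", "co")})  # peephole weights
b = {k: np.zeros(n_h) for k in ("i", "f", "c", "o")}
h, c = lstm_step(np.array([0.5]), np.zeros(n_h), np.zeros(n_h), W, b)
```

Since the hidden state is a sigmoid-gated tanh of the cell state, each component of `h` stays strictly inside (-1, 1).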
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>
<bold>(A)</bold> A layer of the simple LSTM with three gates. <bold>(B)</bold> A Conv-LSTM model with a convolution operation on input and hidden state. <bold>(C)</bold> A CNN-LSTM with a CNN network at an input&#x20;layer.</p>
</caption>
<graphic xlink:href="feart-09-696792-g005.tif"/>
</fig>
</sec>
<sec id="s2-7">
<title>Convolutional Long Short-Term Memory Model</title>
<p>Conv-LSTM combines the convolution operation of the CNN model with the LSTM model (<xref ref-type="bibr" rid="B38">Shi et&#x20;al., 2015</xref>). As shown in <xref ref-type="fig" rid="F5">Figure&#x20;5B</xref>, the convolution operation is applied to the input and the hidden state of the LSTM cells. As a result, at each gate of the LSTM cell, the internal matrix multiplication is replaced by the convolution operation (&#x2a;). This operation can capture the underlying spatial information in high-dimensional data. The Conv-LSTM&#x2019;s key equations are given in <xref ref-type="disp-formula" rid="e6">Eqs 6</xref>&#x2013;<xref ref-type="disp-formula" rid="e10">10</xref> below, where the convolution operation is denoted by &#x2032;&#x2a;&#x2032; and the Hadamard product by &#x2032;<inline-formula id="inf28">
<mml:math id="m33">
<mml:mi>&#x3bf;</mml:mi>
</mml:math>
</inline-formula>&#x2032;:<disp-formula id="e6">
<mml:math id="m34">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>
<disp-formula id="e7">
<mml:math id="m35">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>f</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(7)</label>
</disp-formula>
<disp-formula id="e8">
<mml:math id="m36">
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(8)</label>
</disp-formula>
<disp-formula id="e9">
<mml:math id="m37">
<mml:mrow>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2a;</mml:mtext>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>o</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(9)</label>
</disp-formula>
<disp-formula id="e10">
<mml:math id="m38">
<mml:mrow>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(10)</label>
</disp-formula>where <inline-formula id="inf29">
<mml:math id="m39">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the input gate at timestamp <italic>t</italic>; <inline-formula id="inf30">
<mml:math id="m40">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the forget gate at timestamp <italic>t</italic>; <inline-formula id="inf31">
<mml:math id="m41">
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the cell state at timestamp <italic>t</italic>; <inline-formula id="inf32">
<mml:math id="m42">
<mml:mrow>
<mml:msup>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is&#x20;the output gate at timestamp <italic>t</italic>; and <inline-formula id="inf33">
<mml:math id="m43">
<mml:mrow>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the hidden state at timestamp <italic>t</italic>. The variable <italic>x</italic>
<sub>
<italic>t</italic>
</sub> in the equations represents the input data sequence at timestamp <italic>t</italic>. The matrices <inline-formula id="inf34">
<mml:math id="m44">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf35">
<mml:math id="m45">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf36">
<mml:math id="m46">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the weight matrices between two different layers. Similarly, the <inline-formula id="inf37">
<mml:math id="m47">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf38">
<mml:math id="m48">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>f</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>o</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>and <inline-formula id="inf39">
<mml:math id="m49">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the biases for the input gate, forget gate, output gate, and cell state, respectively. The <inline-formula id="inf40">
<mml:math id="m50">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf41">
<mml:math id="m51">
<mml:mi>&#x3c3;</mml:mi>
</mml:math>
</inline-formula> here denote the hyperbolic tangent and sigmoid (logistic) activation functions, respectively.</p>
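<p>Equations (6)&#x2013;(10) can be traced in a few lines of NumPy. The sketch below is illustrative only: it assumes the peephole weights (<italic>W</italic><sub><italic>ci</italic></sub>, <italic>W</italic><sub><italic>cf</italic></sub>, <italic>W</italic><sub><italic>co</italic></sub>) act element-wise, uses toy dimensions, and writes the first term of the cell-state update with the forget gate, as in the standard LSTM formulation.</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One peephole-LSTM step following Eqs. (6)-(10).
    W holds the weight matrices/vectors, b the gate biases;
    '*' below is the element-wise (Hadamard) product."""
    i = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + W["ci"] * c_prev + b["i"])  # input gate, Eq. (6)
    f = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + W["cf"] * c_prev + b["f"])  # forget gate, Eq. (7)
    c = f * c_prev + i * np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])    # cell state, Eq. (8)
    o = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + W["co"] * c_prev + b["o"])  # output gate, Eq. (9)
    h = o * np.tanh(c)                                                         # hidden state, Eq. (10)
    return h, c

# Toy dimensions: 3 input features (borehole, depth, movement), 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: rng.standard_normal((n_hid, n_in)) * 0.1 for k in ("xi", "xf", "xc", "xo")}
W.update({k: rng.standard_normal((n_hid, n_hid)) * 0.1 for k in ("hi", "hf", "hc", "ho")})
W.update({k: rng.standard_normal(n_hid) * 0.1 for k in ("ci", "cf", "co")})  # peephole weights (assumed diagonal)
b = {k: np.zeros(n_hid) for k in ("i", "f", "c", "o")}
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```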
</sec>
<sec id="s2-8">
<title>CNN-Long Short-Term Memory Model</title>
<p>The CNN-LSTM model is an ensemble of CNN and LSTM models (<xref ref-type="bibr" rid="B41">Wang et&#x20;al., 2016</xref>). As shown in <xref ref-type="fig" rid="F5">Figure&#x20;5C</xref>, the CNN model first searches for spatial information in high-dimensional input data and transforms it into one-dimensional data. The one-dimensional data is then fed as an input to the LSTM model. Here the CNN network acts as a spatial feature extractor.</p>
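<p>The role of the CNN as a spatial feature extractor, producing a one-dimensional vector per timestamp for the LSTM, can be sketched as follows; the filter count and kernel width here are illustrative assumptions, not the values used in the paper.</p>

```python
import numpy as np

def conv1d_valid(features, kernels):
    """Valid 1-D convolution of one timestamp's feature vector with each kernel,
    then flattening: this plays the role of the spatial feature extractor."""
    return np.array([np.convolve(features, k, mode="valid") for k in kernels]).ravel()

# Hypothetical input: 10 timestamps, 3 spatial features per timestamp.
rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))
kernels = rng.standard_normal((4, 2))  # 4 filters of width 2 (illustrative sizes)
lstm_input = np.stack([conv1d_valid(x_t, kernels) for x_t in X])
# Each 3-feature timestamp becomes a flat vector of length (3-2+1)*4 = 8, fed to the LSTM.
```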
</sec>
<sec id="s2-9">
<title>Bidirectional Long Short-Term Memory (Bi-LSTM) Model</title>
<p>This model is designed to improve the performance of the simple LSTM model. The Bi-LSTM model trains two parallel LSTM layers simultaneously (<xref ref-type="bibr" rid="B12">Cui et&#x20;al., 2018</xref>): one layer on the input data in the forward direction and the other in the backward direction (see <xref ref-type="fig" rid="F6">Figure&#x20;6A</xref>). With this forward-and-backward training, the Bi-LSTM model can learn more patterns from the input data than the simple LSTM model, and as the input data grows it can extract information that a single forward pass would miss.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>
<bold>(A)</bold> The structure of the Bi-LSTM. <bold>(B)</bold> The structure of a two-layer stacked LSTM. <bold>(C)</bold> The structure of the BS-LSTM.</p>
</caption>
<graphic xlink:href="feart-09-696792-g006.tif"/>
</fig>
<p>
<xref ref-type="fig" rid="F6">Figure&#x20;6A</xref> depicts the Bi-LSTM design, which consists of two simple LSTM layers. One layer of the model trains the model forward, while a second layer of the LSTM trains the model backward. The parallel layers of the LSTM model receive the same input data and combine their outputs as one output. Finally, in order to forecast the output, the Bi-LSTM model is linked to a dense&#x20;layer.</p>
</sec>
<sec id="s2-10">
<title>Stacked Long Short-Term Memory Model</title>
<p>The stacked LSTM can be built by stacking two simple LSTM layers. The first layer receives the input from the input layer and provides its output as input to the next connected LSTM layer&#x20;(see <xref ref-type="fig" rid="F6">Figure&#x20;6B</xref>) (<xref ref-type="bibr" rid="B48">Yu et&#x20;al., 2019</xref>). Stacking multiple LSTM layers on top of each other allows the model to learn different temporal patterns from various timestamps in the input data, and this design also helps the LSTM models converge faster.</p>
<p>
<xref ref-type="fig" rid="F6">Figure&#x20;6B</xref> shows a stacked LSTM with two layers stacked on top of each other. The first layer of the stacked LSTM model is designed to take data from the input layer and process it before passing it to the next layer. The next layer is linked to the dense layer, which processes the output of the first layer before passing it to the dense layer. Finally, the dense layer forecasts the required outputs. The primary equations of the model, which update the model&#x2019;s initial layer, are as follows:<disp-formula id="e11">
<mml:math id="m52">
<mml:mrow>
<mml:msubsup>
<mml:mi>i</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msubsup>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(11)</label>
</disp-formula>
<disp-formula id="e12">
<mml:math id="m53">
<mml:mrow>
<mml:msubsup>
<mml:mi>f</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msubsup>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>f</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(12)</label>
</disp-formula>
<disp-formula id="e13">
<mml:math id="m54">
<mml:mrow>
<mml:msubsup>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mi>f</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>i</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>h</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>c</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(13)</label>
</disp-formula>
<disp-formula id="e14">
<mml:math id="m55">
<mml:mrow>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msubsup>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>o</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(14)</label>
</disp-formula>
<disp-formula id="e15">
<mml:math id="m56">
<mml:mrow>
<mml:msubsup>
<mml:mi>h</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi>&#x3bf;</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>h</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3e;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(15)</label>
</disp-formula>where the variable <italic>x</italic>
<sub>
<italic>t</italic>
</sub> represents the input data sequence at timestamp <italic>t</italic>. The matrices <inline-formula id="inf42">
<mml:math id="m57">
<mml:mrow>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf43">
<mml:math id="m58">
<mml:mrow>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and<inline-formula id="inf44">
<mml:math id="m59">
<mml:mrow>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>W</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> are the weight matrices between two different layers at level <italic>l</italic>. Similarly, the <inline-formula id="inf45">
<mml:math id="m60">
<mml:mrow>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf46">
<mml:math id="m61">
<mml:mrow>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>f</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>o</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>and <inline-formula id="inf47">
<mml:math id="m62">
<mml:mrow>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>c</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> are the biases at level <italic>l</italic> for the input gate, forget gate, output gate, and cell state, respectively.</p>
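<p>The stacking described by Eqs. (11)&#x2013;(15), in which layer <italic>l</italic> consumes the hidden states of layer <italic>l</italic> &#x2212; 1 at each timestamp, can be sketched as follows; a simplified tanh cell stands in for the full LSTM gates, and all dimensions are illustrative.</p>

```python
import numpy as np

def layer_pass(H_in, W_x, W_h):
    """One recurrent layer: at level l, the input at time t is the hidden state
    h_{l-1}^<t> of the layer below (cf. Eqs. (11)-(15)); simplified tanh cell."""
    h, out = np.zeros(W_h.shape[0]), []
    for x_t in H_in:
        h = np.tanh(W_x @ x_t + W_h @ h)
        out.append(h)
    return np.array(out)

rng = np.random.default_rng(3)
X = rng.standard_normal((6, 3))                                                 # input layer: 6 timestamps, 3 features
H1 = layer_pass(X, rng.standard_normal((8, 3)), rng.standard_normal((8, 8)))    # first LSTM layer
H2 = layer_pass(H1, rng.standard_normal((8, 8)), rng.standard_normal((8, 8)))   # second (stacked) layer
y_hat = rng.standard_normal(8) @ H2[-1]                                         # one-neuron dense layer on the last state
```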
</sec>
<sec id="s2-11">
<title>Bidirectional Stacked Long Short-Term Memory Model</title>
<p>The BS-LSTM network is a newly designed ensemble of a bidirectional LSTM and a stacked LSTM for sequence forecasting: a bidirectional LSTM network is concatenated with a stacked LSTM network (see <xref ref-type="fig" rid="F6">Figure&#x20;6C</xref>). First, the bidirectional LSTM network is trained in both directions of the input time series (week one to week 62 and vice versa). Second, the output of the bidirectional LSTM layers is linked to a dense layer. Next, the dense layer&#x2019;s output is provided as input to the stacked LSTM layers. Finally, the stacked LSTM layers are connected to a dense layer, which forecasts the required outputs.</p>
<p>As seen in <xref ref-type="fig" rid="F6">Figure&#x20;6C</xref>, seven layers make up the structure of the BS-LSTM model. The input layer was the first layer, and the following two parallel layers were trained in the forward and backward directions. Next, the output of the Bi-LSTM part was connected to a dense layer, which provided the values to the input layer of the stacked LSTM. In this stacked LSTM, two LSTM layers were stacked one on top of the other, and the output of the last stacked layer was connected to a dense layer. Finally, this dense layer forecasted the following week&#x2019;s soil movements.</p>
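<p>The seven-layer wiring described above can be sketched end to end; again, a simplified tanh cell stands in for each LSTM layer, and the layer widths are illustrative assumptions, not the tuned values.</p>

```python
import numpy as np

def recurrent(H_in, W_x, W_h, reverse=False):
    # Simplified tanh cell standing in for an LSTM layer.
    if reverse:
        H_in = H_in[::-1]
    h, out = np.zeros(W_h.shape[0]), []
    for x_t in H_in:
        h = np.tanh(W_x @ x_t + W_h @ h)
        out.append(h)
    out = np.array(out)
    return out[::-1] if reverse else out

rng = np.random.default_rng(4)
X = rng.standard_normal((6, 3))                                    # input layer
# Bidirectional front end: forward and backward layers over the same input.
Hf = recurrent(X, rng.standard_normal((4, 3)), rng.standard_normal((4, 4)))
Hb = recurrent(X, rng.standard_normal((4, 3)), rng.standard_normal((4, 4)), reverse=True)
H_bi = np.concatenate([Hf, Hb], axis=1)
# Dense layer linking the Bi-LSTM output to the stacked LSTM input.
H_dense = np.tanh(H_bi @ rng.standard_normal((8, 4)))
# Stacked back end: two recurrent layers, then a one-neuron dense output.
H1 = recurrent(H_dense, rng.standard_normal((4, 4)), rng.standard_normal((4, 4)))
H2 = recurrent(H1, rng.standard_normal((4, 4)), rng.standard_normal((4, 4)))
forecast = float(rng.standard_normal(4) @ H2[-1])                  # next week's soil movement
```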
</sec>
<sec id="s2-12">
<title>Model Parameters Tuning</title>
<p>The first layer in each model was the input layer, whose dimension was determined by the number of features in the dataset, the look-back period, and the batch size. The Tangni dataset had fewer data points than Kumarhatti; thus, the batch size was selected as 16 for Tangni and 1,024 for Kumarhatti. The range of the look-back periods was estimated from the ACF and PACF values: the look-back period was varied from one to five for the Tangni dataset and from one to 42 for the Kumarhatti dataset. The number of features in both datasets was three (i.e.,&#x20;borehole, depth, and soil movement). The second layer was the hidden layer; the nodes in the hidden layers of the LSTM models are called LSTM units. In this research, one-step forecasting was used to predict the soil movement at the next timestamp.</p>
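<p>Bounding the look-back period with the autocorrelation function can be sketched as follows; the synthetic series, lag cutoff, and approximate 95% confidence bound are illustrative assumptions, not the Tangni or Kumarhatti data.</p>

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation up to max_lag, used to bound the look-back period."""
    s = series - series.mean()
    denom = np.dot(s, s)
    return np.array([1.0] + [np.dot(s[:-k], s[k:]) / denom for k in range(1, max_lag + 1)])

# Synthetic weekly soil-movement-like series (a smoothed random walk, for illustration).
rng = np.random.default_rng(5)
x = np.cumsum(rng.standard_normal(200)) * 0.1
r = acf(x, max_lag=10)
# Keep lags whose autocorrelation exceeds the approximate 95% confidence bound.
bound = 1.96 / np.sqrt(len(x))
candidate_lags = [k for k in range(1, 11) if abs(r[k]) > bound]
```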
</sec>
<sec id="s2-13">
<title>Simple Long Short-Term Memory Model</title>
<p>The number of LSTM units in the hidden layer was varied between 1 and 400 with a step size of 50. The hidden layer&#x2019;s output vector size equals the number of LSTM units in this layer. The dimension of the model&#x2019;s output was set by a dense layer, which consists of neurons with a linear activation function; its output vector size equals the number of neurons in the layer. Thus, a dense layer with one neuron was connected to the last hidden layer. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows the various parameters used by this model to forecast the soil movement in the Tangni and Kumarhatti datasets.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Parameter optimization of different LSTM models.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th colspan="7" align="center">Model parameter ranges for the Tangni dataset</th>
</tr>
<tr>
<th align="left">Parameters</th>
<th align="center">Convolutional LSTM</th>
<th align="center">CNN-LSTM</th>
<th align="center">Simple LSTM</th>
<th align="center">Bidirectional LSTM</th>
<th align="center">Stacked LSTM</th>
<th align="center">BS-LSTM</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Layers</td>
<td align="center">4</td>
<td align="center">6</td>
<td align="center">3</td>
<td align="center">3</td>
<td align="center">4</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">Lags or Look-back period</td>
<td align="center">Within 1 to 5</td>
<td align="center">Within 1 to 5</td>
<td align="center">Within 1 to 5</td>
<td align="center">Within 1 to 5</td>
<td align="center">Within 1 to 5</td>
<td align="center">Within 1 to 5</td>
</tr>
<tr>
<td align="left">Batch Size</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
</tr>
<tr>
<td align="left">Epochs</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
</tr>
<tr>
<td align="left">Filter Size in the Convolution Layer</td>
<td align="center">64</td>
<td align="center">64</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
</tr>
<tr>
<td align="left">Pool Size in the Convolution Layer</td>
<td align="center">Not Applicable</td>
<td align="center">2</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
</tr>
<tr>
<td align="left">Kernel Size in the Convolution Layer</td>
<td align="center">(1, 2)</td>
<td align="center">(1, 1)</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
</tr>
<tr>
<td align="left">LSTM Units in the Hidden layer</td>
<td align="center">Not Applicable</td>
<td align="center">Between 1 and 400, with step size 50</td>
<td align="center">Between 1 and 400, with step size 50</td>
<td align="center">Between 1 and 400, with step size 50</td>
<td align="center">Between 1 and 400, with step size 50</td>
<td align="center">Between 1 and 400, with step size 50</td>
</tr>
<tr>
<td align="left">Number of Neurons in the Dense Layer</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
</tr>
<tr>
<td align="left">Inputs Shuffling</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
</tr>
<tr>
<td align="left">Activation Functions</td>
<td align="center">Rectified Linear Unit</td>
<td align="center">Rectified Linear Unit</td>
<td align="center">Linear Activation Function</td>
<td align="center">Linear Activation Function</td>
<td align="center">Linear Activation Function</td>
<td align="center">Linear Activation Function</td>
</tr>
<tr>
<td align="left">Optimizer</td>
<td colspan="6" align="center">Adam</td>
</tr>
<tr>
<td colspan="7" align="center">
<bold>Model parameter ranges for the Kumarhatti dataset</bold>
</td>
</tr>
<tr>
<td align="center">
<bold>Parameters</bold>
</td>
<td align="center">
<bold>Convolutional LSTM</bold>
</td>
<td align="center">
<bold>CNN LSTM</bold>
</td>
<td align="center">
<bold>Simple LSTM</bold>
</td>
<td align="center">
<bold>Bidirectional LSTM</bold>
</td>
<td align="center">
<bold>Stacked LSTM</bold>
</td>
<td align="center">
<bold>BS-LSTM</bold>
</td>
</tr>
<tr>
<td align="left">&#x2003;Layers</td>
<td align="center">4</td>
<td align="center">6</td>
<td align="center">3</td>
<td align="center">3</td>
<td align="center">4</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">&#x2003;Lags or Look-back period</td>
<td align="center">Within 1 to 42</td>
<td align="center">Within 1 to 42</td>
<td align="center">Within 1 to 42</td>
<td align="center">Within 1 to 42</td>
<td align="center">Within 1 to 42</td>
<td align="center">Within 1 to 42</td>
</tr>
<tr>
<td align="left">&#x2003;Batch Size</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
</tr>
<tr>
<td align="left">&#x2003;Epochs</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
<td align="center">10 or 50</td>
</tr>
<tr>
<td align="left">&#x2003;Filter Size in the Convolution Layer</td>
<td align="center">64</td>
<td align="center">64</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
</tr>
<tr>
<td align="left">&#x2003;Pool Size in the Convolution Layer</td>
<td align="center">Not Applicable</td>
<td align="center">2</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
</tr>
<tr>
<td align="left">&#x2003;Kernel Size in the Convolution Layer</td>
<td align="center">(1, 2)</td>
<td align="center">(1, 1)</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
<td align="center">Not Applicable</td>
</tr>
<tr>
<td align="left">&#x2003;LSTM Units in the Hidden layer</td>
<td align="center">Not Applicable</td>
<td align="center">Between 1 and 500, with step 10</td>
<td align="center">Between 1 and 500, with step 10</td>
<td align="center">Between 1 and 500, with step 10</td>
<td align="center">Between 1 and 500, with step 10</td>
<td align="center">Between 1 and 500, with step 10</td>
</tr>
<tr>
<td align="left">&#x2003;Number of Neurons in the Dense Layer</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
</tr>
<tr>
<td align="left">&#x2003;Inputs Shuffling</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
<td align="center">Yes/No</td>
</tr>
<tr>
<td align="left">&#x2003;Activation Function</td>
<td align="center">Rectified Linear Unit</td>
<td align="center">Rectified Linear Unit</td>
<td align="center">Linear Activation Function</td>
<td align="center">Linear Activation Function</td>
<td align="center">Linear Activation Function</td>
<td align="center">Linear Activation Function</td>
</tr>
<tr>
<td align="left">&#x2003;Optimizer</td>
<td colspan="6" align="center">Adam</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s2-14">
<title>Conv-Long Short-Term Memory Model</title>
<p>In this model, the output dimension of the convolution layer was determined by the batch size, the new number of rows, the new number of columns, and the number of filters. In the convolution layer, the batch size was set to 16 for Tangni and 1,024 for Kumarhatti. The new numbers of rows and columns were set to one and two, respectively. The number of filters was set to 64. The kernel size of the convolution layer was set to <inline-formula id="inf48">
<mml:math id="m63">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. Because the convolution layer&#x2019;s output was multidimensional, a flatten layer was added after it to convert the multidimensional data into one dimension. The dimension of the model&#x2019;s output could then be set by a dense layer, which consists of one or more neurons with a rectified linear unit (ReLU) activation function; its output vector size equals its number of neurons. Thus, a dense layer with one neuron was connected to the last hidden layer. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows the various parameters used by this model to forecast the soil movements in the Tangni and Kumarhatti datasets.</p>
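<p>A layer arrangement of this kind can be sketched with a convolutional LSTM followed by flatten and dense layers. This is a minimal illustration assuming a Keras-style implementation; the 1&#x20;&#xd7;&#x20;3 spatial grid, the single channel, and the look-back of 4 timesteps are our assumptions, not details taken from the paper.</p>

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import ConvLSTM2D, Flatten, Dense

# input_shape = (timesteps, rows, cols, channels); the 1 x 3 grid with one
# channel is an illustrative assumption.
model = Sequential([
    Input(shape=(4, 1, 3, 1)),
    ConvLSTM2D(filters=64, kernel_size=(1, 2), activation="relu"),
    Flatten(),                     # convert the multidimensional output to 1-D
    Dense(1, activation="relu"),   # one neuron => one-look-ahead forecast
])
model.compile(optimizer="adam", loss="mse")
```

<p>Here <monospace>Flatten</monospace> plays the role of the flatten layer described in the text, and the single-neuron dense layer produces the one-look-ahead forecast.</p>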
</sec>
<sec id="s2-15">
<title>CNN-Long Short-Term Memory Model</title>
<p>In this model, the time-distributed convolution layer applied the same convolution operation to each timestamp as the LSTM unrolled. The output dimension of the convolution layer was determined by the batch size, the new number of rows, the new number of columns, and the number of filters. In the convolution layer, both the new number of rows and the new number of columns were set to one. The number of filters was set to 64. The kernel size of the convolution layer was set to <inline-formula id="inf49">
<mml:math id="m64">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. A max-pooling layer with a pool size of 2 was used to reduce the spatial dimensions. Because the max-pooling layer&#x2019;s output was multidimensional, a flatten layer was added after it to convert the multidimensional data into one dimension before passing it to the next hidden layer. The nodes in the hidden layer are called LSTM units; their number was varied between 1 and 400 with a step size of 50, and the hidden layer&#x2019;s output vector size equals its number of LSTM units. The dimension of the model&#x2019;s output could then be set by a dense layer, which consists of one or more neurons with a rectified linear unit (ReLU) activation function; its output vector size equals its number of neurons. Thus, a dense layer with one neuron was connected to the last hidden layer. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows the various parameters used by this model to forecast the soil movements in the Tangni and Kumarhatti datasets.</p>
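<p>One way to realize a time-distributed convolution ahead of an LSTM is sketched below, assuming a Keras-style implementation. The split of the look-back window into two subsequences of two steps each, the use of a 1-D convolution, and the 100 LSTM units are illustrative assumptions, not values taken from the paper.</p>

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv1D, MaxPooling1D, Flatten, LSTM,
                                     Dense, TimeDistributed)

# input_shape = (subsequences, steps per subsequence, features); splitting a
# look-back window of 4 into 2 x 2 is an illustrative assumption.
model = Sequential([
    Input(shape=(2, 2, 1)),
    TimeDistributed(Conv1D(filters=64, kernel_size=1, activation="relu")),
    TimeDistributed(MaxPooling1D(pool_size=2)),  # reduce spatial dimensions
    TimeDistributed(Flatten()),                  # 1-D features per subsequence
    LSTM(100),                                   # hidden layer of LSTM units
    Dense(1, activation="relu"),                 # one-look-ahead forecast
])
model.compile(optimizer="adam", loss="mse")
```

<p><monospace>TimeDistributed</monospace> is what applies the same convolution, pooling, and flattening at every timestep as the LSTM unrolls.</p>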
</sec>
<sec id="s2-16">
<title>Bi-Long Short-Term Memory Model</title>
<p>In this model, the nodes in the hidden layers are called LSTM units. The number of LSTM units in both hidden layers was varied simultaneously between 1 and 400 with a step size of 50, and each hidden layer&#x2019;s output vector size equals its number of LSTM units. The dimension of the model&#x2019;s output could be set by a dense layer, which consists of one or more neurons with a linear activation function; its output vector size equals its number of neurons. Because this research uses the one-look-ahead forecasting method, a dense layer with one neuron was connected to the last hidden layer. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows the various parameters used by this model to forecast the soil movements in the Tangni and Kumarhatti datasets.</p>
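<p>As a concrete illustration, a Bi-LSTM of this shape can be written in a few lines, assuming a Keras-style implementation. The 400 LSTM units and the look-back of 4 correspond to the best Tangni configuration reported in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, i.e., one point in the search grid rather than a fixed design choice.</p>

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense

model = Sequential([
    Input(shape=(4, 1)),            # (look-back steps, features)
    Bidirectional(LSTM(400)),       # forward and backward passes over the window
    Dense(1, activation="linear"),  # one neuron => one-look-ahead forecast
])
model.compile(optimizer="adam", loss="mse")
```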
</sec>
<sec id="s2-17">
<title>Stacked Long Short-Term Memory Model</title>
<p>In this model, the number of LSTM units in the hidden layer was varied between 1 and 400 with a step size of 50, and the hidden layer&#x2019;s output vector size equals its number of LSTM units. The dimension of the model&#x2019;s output could be set by a dense layer, which consists of one or more neurons with a linear activation function; its output vector size equals its number of neurons. Thus, a dense layer with one neuron was connected to the last hidden layer. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows the various parameters used by this model to forecast the soil movements in the Tangni and Kumarhatti datasets.</p>
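<p>A stacked LSTM of this shape can be sketched as follows, assuming a Keras-style implementation; the 200 and 100 unit sizes follow the optimized Tangni values reported in <xref ref-type="table" rid="T4">Table&#x20;4</xref>. Setting <monospace>return_sequences=True</monospace> is what lets the first stacked layer pass a full sequence, rather than a single vector, to the second.</p>

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    Input(shape=(4, 1)),               # (look-back steps, features)
    LSTM(200, return_sequences=True),  # first stacked layer emits a sequence
    LSTM(100),                         # second stacked layer emits a vector
    Dense(1, activation="linear"),     # one neuron => one-look-ahead forecast
])
model.compile(optimizer="adam", loss="mse")
```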
</sec>
<sec id="s2-18">
<title>BS-Long Short-Term Memory Model</title>
<p>This model has seven layers: three in the Bi-LSTM part (one input, one hidden, one output) and four in the stacked LSTM part (one input, two stacked, one output). The architecture of the Bi-LSTM and stacked LSTM layers was the same as that of the Bi-LSTM and stacked LSTM models developed in this paper. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows the various parameter values used by this model to forecast the soil movement in the Tangni and Kumarhatti datasets. For the Tangni dataset, the BS-LSTM parameters were varied as follows: the batch size was fixed at 16; the LSTM units in the hidden layers were varied between 1 and 400 with a step size of 50, both for the Bi-LSTM part and for the two stacked layers of the stacked LSTM part; the number of epochs was set to 10 or 50; the look-back period was varied from one to five; and the inputs were passed with shuffling turned on or off. For the Kumarhatti dataset, the BS-LSTM parameters were varied as follows: the batch size was fixed at 1,024; the hidden layers&#x2019; LSTM units in the Bi-LSTM and stacked LSTM parts were varied between 1 and 500 with a step size of 10; the number of epochs was set to 10 or 50; the look-back period was varied between 1 and 42 (as estimated from the ACF and PACF); and the inputs were passed with and without shuffling.</p>
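<p>The two-stage BS-LSTM pipeline can be sketched as below, assuming a Keras-style implementation. The training and prediction calls are indicated only as comments, the layer sizes follow the best Tangni configuration reported in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, and <monospace>rewindow</monospace> is a hypothetical helper that slides a look-back window over the stage-one forecasts.</p>

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense

look_back = 4  # best Tangni look-back period reported in Table 4

# Stage 1: a Bi-LSTM forecasts one step ahead from the raw movement windows.
bi_lstm = Sequential([
    Input(shape=(look_back, 1)),
    Bidirectional(LSTM(400)),
    Dense(1, activation="linear"),
])
bi_lstm.compile(optimizer="adam", loss="mse")
# bi_lstm.fit(X_train, y_train, epochs=50, batch_size=16, shuffle=True)

# Stage 2: the Bi-LSTM's forecasts are re-windowed and refined by a stacked LSTM.
stacked_lstm = Sequential([
    Input(shape=(look_back, 1)),
    LSTM(200, return_sequences=True),  # first stacked layer
    LSTM(100),                         # second stacked layer
    Dense(1, activation="linear"),
])
stacked_lstm.compile(optimizer="adam", loss="mse")
# stage1 = bi_lstm.predict(X_train)
# stacked_lstm.fit(rewindow(stage1), y_train[look_back:], epochs=50, batch_size=4)
```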
</sec>
<sec id="s2-19">
<title>Model&#x2019;s Inputs and Outputs</title>
<p>Before entering the time series data into the models, we divided the input data into several packets based on borehole and depth. Packets from the first 80% of the data were used to train the models, and packets from the last 20% were used to test them. Each packet was a combination of X and Y, where X was the predictor and Y was the forecasted soil movement value (see <xref ref-type="fig" rid="F7">Figure&#x20;7</xref>). As shown in <xref ref-type="fig" rid="F7">Figure&#x20;7</xref>, the X predictor formed a movement vector consisting of the soil movements recorded by a sensor in a borehole over a number of timestamps (M), the borehole number of the sensor (Borehole), and the depth of the sensor (Depth). The M value was the soil movement recorded by a sensor in a borehole at a particular time. The Borehole value was one of the five boreholes at the site. The Depth value was the depth of the sensor in a specific borehole. For example, for a look-back period of three in the Tangni dataset, packet X<sub>1</sub> may contain the movements recorded in the first 3&#xa0;weeks by the sensor in borehole one at a depth of 3&#xa0;m. The corresponding Y<sub>1</sub> contained the actual value of the movement recorded by the same sensor in borehole one in week four, to be forecasted by a model. Thus, there was a one-look-ahead soil movement forecast (Y) for a certain look-back period and for a particular sensor at a specific depth (X) (the look-back was passed to the models as a parameter). These X and Y packets could be shuffled before being input to a model, where the shuffle operation shuffled the two lists (i.e.,&#x20;X and Y) in the same order (see <xref ref-type="fig" rid="F7">Figure&#x20;7</xref>).</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Inputs and outputs in a&#x20;model.</p>
</caption>
<graphic xlink:href="feart-09-696792-g007.tif"/>
</fig>
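<p>The packet construction and unison shuffle described above can be sketched in a few lines of NumPy. The function names and the flat [movements, borehole, depth] packet layout are illustrative assumptions; the paper does not specify the exact encoding.</p>

```python
import numpy as np

def make_packets(movements, borehole, depth, look_back):
    """Build (X, Y) packets for one sensor: each X holds `look_back` past
    readings plus the sensor's borehole number and depth; Y is the next reading."""
    X, Y = [], []
    for i in range(len(movements) - look_back):
        X.append(list(movements[i:i + look_back]) + [borehole, depth])
        Y.append(movements[i + look_back])
    return np.array(X, dtype=float), np.array(Y, dtype=float)

def shuffle_in_unison(X, Y, seed=42):
    """Shuffle the X and Y lists in the same order, keeping pairs aligned."""
    order = np.random.default_rng(seed).permutation(len(X))
    return X[order], Y[order]
```

<p>For a look-back of three, the first packet pairs weeks one to three (plus borehole and depth) with the week-four reading as its target.</p>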
</sec>
<sec id="s2-20">
<title>Dropouts in Models</title>
<p>The Tangni dataset has a limited amount of data: its training dataset has only 62 data points to train the LSTM models. When the training dataset is limited, the model&#x2019;s parameters can overfit during training. Dropout can be applied to the layers of an LSTM to prevent overfitting (<xref ref-type="bibr" rid="B36">Pham et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B13">Gal and Ghahramani, 2016</xref>). Different combinations of dropout probabilities (<italic>p</italic>) were applied to the input&#x2013;output and recurrent connections of the LSTM layer. The probability value (<italic>p</italic>) was varied between 0.0 and 0.8 with a step size of 0.2, where a <italic>p</italic>-value of 0.0 means no dropout was applied. For example, the combination (0.2, 0.8) represents 20% dropout applied to the LSTM unit&#x2019;s output and 80% dropout applied to the recurrent input of the LSTM unit (<xref ref-type="bibr" rid="B13">Gal and Ghahramani, 2016</xref>).</p>
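<p>The dropout grid described above can be enumerated as follows; the variable names are ours. In Keras-style APIs, such a pair roughly corresponds to the <monospace>dropout</monospace> and <monospace>recurrent_dropout</monospace> arguments of the LSTM layer.</p>

```python
from itertools import product

# p varies from 0.0 to 0.8 in steps of 0.2 for both the output and the
# recurrent connections of the LSTM layer; 0.0 means no dropout.
p_values = [round(0.2 * k, 1) for k in range(5)]          # [0.0, 0.2, 0.4, 0.6, 0.8]
dropout_combinations = list(product(p_values, repeat=2))  # (p_output, p_recurrent)
```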
</sec>
<sec id="s2-21">
<title>Performance Measure for the Models</title>
<p>Soil movement forecasting is a regression problem, where the forecasted soil movement is a real-valued quantity. Thus, an error can be calculated between the true and forecasted values of the soil movements. Different performance measures have been used to evaluate such models (<xref ref-type="bibr" rid="B7">Behera et&#x20;al., 2018</xref>). In this paper, four performance measures are used: mean relative error (MRE), root mean square error (RMSE), normalized root mean squared error (NRMSE), and mean absolute error (MAE). <xref ref-type="disp-formula" rid="e16">Eqs 16</xref>&#x2013;<xref ref-type="disp-formula" rid="e19">19</xref> were used to calculate these measures, which quantify the difference between the actual and forecasted points of the datasets.<disp-formula id="e16">
<mml:math id="m65">
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>E</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>e</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(16)</label>
</disp-formula>
<disp-formula id="e17">
<mml:math id="m66">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>E</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3a3;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>e</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3a3;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(17)</label>
</disp-formula>
<disp-formula id="e18">
<mml:math id="m67">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mi>A</mml:mi>
<mml:mi>E</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>e</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(18)</label>
</disp-formula>
<disp-formula id="e19">
<mml:math id="m68">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mi>R</mml:mi>
<mml:mi>E</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>e</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mtext>&#x2a;</mml:mtext>
<mml:mn>100</mml:mn>
</mml:mrow>
</mml:math>
<label>(19)</label>
</disp-formula>where <italic>n</italic> denotes the number of data points in the Tangni or Kumarhatti dataset, the true angle denotes the actual observed value of the soil movements in the dataset, and the forecasted angle denotes the value of the soil movements forecasted by the&#x20;model.</p>
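<p>Eqs 16&#x2013;19 translate directly into NumPy; a minimal sketch follows (the function names are ours).</p>

```python
import numpy as np

def rmse(true, forecast):
    # Eq. 16: root of the mean squared difference
    return float(np.sqrt(np.mean((true - forecast) ** 2)))

def nrmse(true, forecast):
    # Eq. 17: root of the summed squared differences, normalized by the
    # root of the summed squared forecasts
    return float(np.sqrt(np.sum((true - forecast) ** 2))
                 / np.sqrt(np.sum(forecast ** 2)))

def mae(true, forecast):
    # Eq. 18: mean absolute difference
    return float(np.mean(np.abs(true - forecast)))

def mre(true, forecast):
    # Eq. 19: mean relative error, expressed as a percentage
    return float(np.mean(np.abs((true - forecast) / forecast)) * 100)
```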
</sec>
<sec id="s2-22">
<title>Model Calibration</title>
<p>We created a grid search procedure to calibrate the parameters of the various models. In this process, we varied the different sets of parameters (described in <xref ref-type="table" rid="T2">Table&#x20;2</xref>) in an LSTM model. After feeding each combination into the LSTM models, we recorded the MAE, RMSE, NRMSE, and MRE, and we chose the parameter combination with the lowest error for the&#x20;model.</p>
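<p>The calibration loop can be sketched as an exhaustive search; here <monospace>evaluate</monospace> stands for a hypothetical routine that trains one model with the given parameters and returns its error (e.g., test RMSE).</p>

```python
from itertools import product

def grid_search(evaluate, grid):
    """Try every parameter combination in `grid` (a dict of name -> list of
    values) and return the combination with the lowest error."""
    best_params, best_error = None, float("inf")
    for values in product(*grid.values()):
        params = dict(zip(grid, values))
        error = evaluate(params)  # train a model with `params`, return its RMSE
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error
```

<p>A grid such as <monospace>{"look_back": [1, 2, 3, 4, 5], "epochs": [10, 50], "shuffle": [True, False]}</monospace> (illustrative values) would then be searched exhaustively.</p>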
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p>The developed LSTM models were first trained on the first 80% of the data and then tested on the remaining 20%. The training and&#x20;testing results of these models across the Tangni and Kumarhatti datasets are reported in <xref ref-type="table" rid="T3">Table&#x20;3</xref>. The results in <xref ref-type="table" rid="T3">Table&#x20;3</xref> are sorted according to each model&#x2019;s performance (minimum RMSE first) on the testing dataset. As can be seen in <xref ref-type="table" rid="T3">Table&#x20;3</xref>, the BS-LSTM and Bi-LSTM models outperformed the other models in training and testing across the Tangni and Kumarhatti datasets.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Errors of various models in the training and testing dataset.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Models</th>
<th rowspan="2" align="center">Borehole</th>
<th colspan="4" align="center">Training</th>
<th colspan="4" align="center">Testing</th>
</tr>
<tr>
<th align="center">MAE</th>
<th align="center">RMSE</th>
<th align="center">NRMSE</th>
<th align="center">MRE</th>
<th align="center">MAE</th>
<th align="center">RMSE</th>
<th align="center">NRMSE</th>
<th align="center">MRE</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="10" align="center">
<bold>Performance of models on the Tangni dataset</bold>
</td>
</tr>
<tr>
<td rowspan="6" align="left">BS-LSTM</td>
<td align="center">1&#x2013;03&#xa0;m</td>
<td align="char" char=".">0.22</td>
<td align="char" char=".">0.39</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">23.29</td>
<td align="char" char=".">0.15</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">2.63</td>
</tr>
<tr>
<td align="center">2&#x2013;12&#xa0;m</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.23</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.00</td>
</tr>
<tr>
<td align="center">3&#x2013;06&#xa0;m</td>
<td align="char" char=".">0.07</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">0.07</td>
<td align="char" char=".">19.19</td>
<td align="char" char=".">0.25</td>
<td align="char" char=".">0.58</td>
<td align="char" char=".">0.24</td>
<td align="char" char=".">66.42</td>
</tr>
<tr>
<td align="center">4&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">2.56</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.04</td>
</tr>
<tr>
<td align="center">5&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.29</td>
<td align="char" char=".">0.46</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">23.32</td>
<td align="char" char=".">0.34</td>
<td align="char" char=".">0.54</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">38.26</td>
</tr>
<tr>
<td align="center">Average</td>
<td align="char" char=".">0.13</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">13.92</td>
<td align="char" char=".">0.16</td>
<td align="char" char=".">0.27</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">16.70</td>
</tr>
<tr>
<td rowspan="6" align="left">Bidirectional LSTM</td>
<td align="center">1&#x2013;03&#xa0;m</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.25</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">4.83</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.77</td>
</tr>
<tr>
<td align="center">2&#x2013;12&#xa0;m</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.73</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.52</td>
</tr>
<tr>
<td align="center">3&#x2013;06&#xa0;m</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">9.31</td>
<td align="char" char=".">209.11</td>
<td align="char" char=".">0.36</td>
<td align="char" char=".">0.74</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">238.46</td>
</tr>
<tr>
<td align="center">4&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.16</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.72</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.83</td>
</tr>
<tr>
<td align="center">5&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">0.69</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">686.76</td>
<td align="char" char=".">0.46</td>
<td align="char" char=".">0.67</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">256.24</td>
</tr>
<tr>
<td align="center">Average</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">1.87</td>
<td align="char" char=".">180.63</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.30</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">99.36</td>
</tr>
<tr>
<td rowspan="6" align="left">Stacked LSTM</td>
<td align="center">1&#x2013;03&#xa0;m</td>
<td align="char" char=".">0.20</td>
<td align="char" char=".">0.34</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">9.17</td>
<td align="char" char=".">0.16</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">2.74</td>
</tr>
<tr>
<td align="center">2&#x2013;12&#xa0;m</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.19</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.93</td>
</tr>
<tr>
<td align="center">3&#x2013;06&#xa0;m</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">1.17</td>
<td align="char" char=".">99.12</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">0.75</td>
<td align="char" char=".">0.17</td>
<td align="char" char=".">133.96</td>
</tr>
<tr>
<td align="center">4&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">0.17</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">3.03</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.51</td>
</tr>
<tr>
<td align="center">5&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.44</td>
<td align="char" char=".">0.73</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">95.59</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">0.58</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">145.64</td>
</tr>
<tr>
<td align="center">Average</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.28</td>
<td align="char" char=".">0.24</td>
<td align="char" char=".">41.62</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">0.32</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">56.96</td>
</tr>
<tr>
<td rowspan="6" align="left">CNN-LSTM</td>
<td align="center">1&#x2013;03&#xa0;m</td>
<td align="char" char=".">0.27</td>
<td align="char" char=".">0.79</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">48.62</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.81</td>
</tr>
<tr>
<td align="center">2&#x2013;12&#xa0;m</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.95</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.56</td>
</tr>
<tr>
<td align="center">3&#x2013;06&#xa0;m</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">1.08</td>
<td align="char" char=".">87.81</td>
<td align="char" char=".">0.32</td>
<td align="char" char=".">0.66</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">86.81</td>
</tr>
<tr>
<td align="center">4&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.30</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">6.13</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">2.32</td>
</tr>
<tr>
<td align="center">5&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.38</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">57.83</td>
<td align="char" char=".">0.50</td>
<td align="char" char=".">0.74</td>
<td align="char" char=".">0.07</td>
<td align="char" char=".">243.92</td>
</tr>
<tr>
<td align="center">Average</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">0.22</td>
<td align="char" char=".">40.47</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">0.33</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">67.28</td>
</tr>
<tr>
<td rowspan="6" align="left">Simple LSTM</td>
<td align="center">1&#x2013;03&#xa0;m</td>
<td align="char" char=".">0.19</td>
<td align="char" char=".">0.35</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">23.42</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">0.07</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.10</td>
</tr>
<tr>
<td align="center">2&#x2013;12&#xa0;m</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">4.01</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">3.26</td>
</tr>
<tr>
<td align="center">3&#x2013;06&#xa0;m</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.13</td>
<td align="char" char=".">1.15</td>
<td align="char" char=".">99.13</td>
<td align="char" char=".">0.44</td>
<td align="char" char=".">0.81</td>
<td align="char" char=".">0.22</td>
<td align="char" char=".">146.67</td>
</tr>
<tr>
<td align="center">4&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.16</td>
<td align="char" char=".">0.28</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">6.27</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">2.72</td>
</tr>
<tr>
<td align="center">5&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.48</td>
<td align="char" char=".">0.73</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">59.11</td>
<td align="char" char=".">0.51</td>
<td align="char" char=".">0.77</td>
<td align="char" char=".">0.07</td>
<td align="char" char=".">236.76</td>
</tr>
<tr>
<td align="center">Average</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">0.32</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">38.39</td>
<td align="char" char=".">0.24</td>
<td align="char" char=".">0.37</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">78.10</td>
</tr>
<tr>
<td rowspan="6" align="left">Convolutional LSTM</td>
<td align="center">1&#x2013;03&#xa0;m</td>
<td align="char" char=".">0.29</td>
<td align="char" char=".">0.78</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">608.79</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.76</td>
</tr>
<tr>
<td align="center">2&#x2013;12&#xa0;m</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">4.47</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">4.02</td>
</tr>
<tr>
<td align="center">3&#x2013;06&#xa0;m</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">54.84</td>
<td align="char" char=".">1,111.34</td>
<td align="char" char=".">0.36</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">0.19</td>
<td align="char" char=".">646.81</td>
</tr>
<tr>
<td align="center">4&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">0.32</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">9.46</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.20</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">5.68</td>
</tr>
<tr>
<td align="center">5&#x2013;15&#xa0;m</td>
<td align="char" char=".">0.52</td>
<td align="char" char=".">0.79</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">138.42</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">1.06</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">477.23</td>
</tr>
<tr>
<td align="center">Average</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">10.97</td>
<td align="char" char=".">374.50</td>
<td align="char" char=".">0.29</td>
<td align="char" char=".">0.43</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">226.90</td>
</tr>
<tr>
<td colspan="10" align="center">
<bold>Performance of models on the Kumarhatti dataset</bold>
</td>
</tr>
<tr>
<td align="left">BS-LSTM</td>
<td align="char" char="ndash">1&#x2013;1&#xa0;m</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.44</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.01</td>
</tr>
<tr>
<td align="left">Bidirectional LSTM</td>
<td align="char" char="ndash">1&#x2013;1&#xa0;m</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.63</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.01</td>
</tr>
<tr>
<td align="left">Simple LSTM</td>
<td align="char" char="ndash">1&#x2013;1&#xa0;m</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.69</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.19</td>
</tr>
<tr>
<td align="left">Convolutional LSTM</td>
<td align="char" char="ndash">1&#x2013;1&#xa0;m</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">1.93</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.13</td>
</tr>
<tr>
<td align="left">Stacked LSTM</td>
<td align="char" char="ndash">1&#x2013;1&#xa0;m</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">12.70</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">13.63</td>
</tr>
<tr>
<td align="left">CNN-LSTM</td>
<td align="char" char="ndash">1&#x2013;1&#xa0;m</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">0.39</td>
<td align="char" char=".">87.82</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">1.15</td>
<td align="char" char=".">131.58</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>
<xref ref-type="table" rid="T4">Table&#x20;4</xref> presents the optimized parameter values for all models. As can be seen in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, for the Tangni landslide dataset, the Bi-LSTM model&#x2019;s error was lowest with a look-back period &#x3d; 4, number of epochs &#x3d; 50, batch size &#x3d; 16, LSTM units in the hidden layer &#x3d; 400, shuffling of inputs turned on, and no dropout applied to the LSTM layer. <xref ref-type="table" rid="T4">Table&#x20;4</xref> also shows the optimized values for the BS-LSTM model, which was an ensemble of two different recurrent models, a Bi-LSTM and a stacked LSTM. The Bi-LSTM was trained first; the set of Bi-LSTM parameters that minimized the RMSE on the Tangni dataset was: look-back period &#x3d; 4, shuffling turned on, LSTM units in the hidden layer &#x3d; 400, number of epochs &#x3d; 50, batch size &#x3d; 16, and no dropout applied to the LSTM layer. Next, the trained Bi-LSTM model&#x2019;s forecasted values were fed into a stacked LSTM as input. The stacked LSTM was then trained, and its minimum RMSE was produced with the following parameters: batch size &#x3d; 4, number of epochs &#x3d; 50, no shuffling of inputs, no dropout, number of nodes in the input layer &#x3d; 12, number of LSTM units in the first stacked layer &#x3d; 200 and in the second stacked layer &#x3d; 100, and number of neurons in the dense layer &#x3d; 1. For the Tangni dataset, most LSTM models had a look-back period of four, as shown in <xref ref-type="table" rid="T4">Table&#x20;4</xref>. Furthermore, the Kumarhatti dataset had a PACF value of one, implying that this time series required only a single look-back period; as shown in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, most LSTM models had a look-back period of one on the Kumarhatti dataset. The parameter optimization of the models found that the look-back period was the most critical parameter for reducing&#x20;error.</p>
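<p>Since the look-back period proved the most critical parameter, it is worth being concrete about what it controls. The following minimal Python sketch (our illustration, not the authors&#x2019; code; the function name is hypothetical) builds the supervised input/target pairs that a look-back window implies for one-step-ahead forecasting:</p>

```python
def make_windows(series, look_back):
    """Turn a univariate series into (input window, next value) pairs
    for one-step-ahead forecasting with the given look-back period."""
    inputs, targets = [], []
    for i in range(len(series) - look_back):
        inputs.append(series[i:i + look_back])  # last `look_back` observations
        targets.append(series[i + look_back])   # the value to forecast
    return inputs, targets

# Tangni models used look_back = 4; Kumarhatti models used look_back = 1.
X, y = make_windows([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], look_back=4)
# X -> [[0.1, 0.2, 0.3, 0.4], [0.2, 0.3, 0.4, 0.5]], y -> [0.5, 0.6]
```

<p>With a look-back period of one, each forecast depends only on the single previous observation, matching the PACF-based choice for the Kumarhatti series.</p>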
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Optimized parameters for the different models on the Tangni and Kumarhatti datasets.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th colspan="7" align="center">Optimal parameters of the different models on the Tangni dataset</th>
</tr>
<tr>
<th align="left">Parameters</th>
<th align="center">Convolutional LSTM</th>
<th align="center">CNN-LSTM</th>
<th align="center">Simple LSTM</th>
<th align="center">Bidirectional LSTM</th>
<th align="center">Stacked LSTM</th>
<th align="center">BS-LSTM</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Layers</td>
<td align="center">4</td>
<td align="center">6</td>
<td align="center">3</td>
<td align="center">3</td>
<td align="center">4</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">Lags or Look-back period</td>
<td align="center">2</td>
<td align="center">4</td>
<td align="center">4</td>
<td align="center">4</td>
<td align="center">4</td>
<td align="center">4</td>
</tr>
<tr>
<td align="left">Epochs</td>
<td align="center">10</td>
<td align="center">50</td>
<td align="center">50</td>
<td align="center">50</td>
<td align="center">50</td>
<td align="center">50</td>
</tr>
<tr>
<td align="left">Batch Size</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
<td align="center">16</td>
</tr>
<tr>
<td align="left">LSTM Units in the Hidden Layer</td>
<td align="center">Not Applicable</td>
<td align="center">50</td>
<td align="center">200</td>
<td align="center">400</td>
<td align="center">200, 100</td>
<td align="center">Bi-LSTM (400), Stacked LSTM (200, 100)</td>
</tr>
<tr>
<td align="left">Number of Neurons in the Dense Layer</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
</tr>
<tr>
<td align="left">Inputs Shuffling</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">Bi-LSTM (Yes), Stacked LSTM (No)</td>
</tr>
<tr>
<td align="left">Dropout at Input Layer</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
</tr>
<tr>
<td align="left">Dropout at Dense Layer</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
</tr>
<tr>
<td colspan="7" align="center">
<bold>Optimal parameters of the different models on the Kumarhatti dataset</bold>
</td>
</tr>
<tr>
<td align="left">
<bold>Parameters</bold>
</td>
<td align="center">
<bold>Convolutional LSTM</bold>
</td>
<td align="center">
<bold>CNN-LSTM</bold>
</td>
<td align="center">
<bold>Simple LSTM</bold>
</td>
<td align="center">
<bold>Bidirectional LSTM</bold>
</td>
<td align="center">
<bold>Stacked LSTM</bold>
</td>
<td align="center">
<bold>BS-LSTM</bold>
</td>
</tr>
<tr>
<td align="left">&#x2003;Layers</td>
<td align="center">4</td>
<td align="center">6</td>
<td align="center">3</td>
<td align="center">3</td>
<td align="center">4</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">&#x2003;Lags or Look-back period</td>
<td align="center">25</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
</tr>
<tr>
<td align="left">&#x2003;Epochs</td>
<td align="center">10</td>
<td align="center">50</td>
<td align="center">50</td>
<td align="center">50</td>
<td align="center">50</td>
<td align="center">50</td>
</tr>
<tr>
<td align="left">&#x2003;Batch Size</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
<td align="center">1,024</td>
</tr>
<tr>
<td align="left">&#x2003;LSTM Units in the Hidden Layer</td>
<td align="center">Not Applicable</td>
<td align="center">300</td>
<td align="center">240</td>
<td align="center">410</td>
<td align="center">160, 330</td>
<td align="center">Bi-LSTM (50), Stacked LSTM (250, 300)</td>
</tr>
<tr>
<td align="left">&#x2003;Number of Neurons in the Dense Layer</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
<td align="center">1</td>
</tr>
<tr>
<td align="left">&#x2003;Inputs Shuffling</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Bi-LSTM (Yes), Stacked LSTM (Yes)</td>
</tr>
<tr>
<td align="left">&#x2003;Dropout at Input Layer</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
</tr>
<tr>
<td align="left">&#x2003;Dropout at Dense Layer</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
<td align="center">0.0</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>
<xref ref-type="fig" rid="F8">Figure&#x20;8</xref> depicts the top-performing BS-LSTM model&#x2019;s training and test fit over five boreholes from the Tangni site and one borehole from the Kumarhatti&#x20;site.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>The best performing BS-LSTM model showing the soil movements (in degrees) over the Tangni landslide training and test datasets. <bold>(A)</bold> The sensor at 3&#xa0;m in borehole one at Tangni. <bold>(B)</bold> The sensor at 12&#xa0;m in borehole two at Tangni. <bold>(C)</bold> The sensor at 6&#xa0;m in borehole three at Tangni. <bold>(D)</bold> The sensor at 15&#xa0;m in borehole four at Tangni. <bold>(E)</bold> The sensor at 15&#xa0;m in borehole five at Tangni. <bold>(F)</bold> The best performing BS-LSTM model showing the soil movement (in meters) over the Kumarhatti training and testing datasets.</p>
</caption>
<graphic xlink:href="feart-09-696792-g008.tif"/>
</fig>
</sec>
<sec id="s4">
<title>Discussion and Conclusion</title>
<p>Recurrent neural network models could be applied to forecast soil movements and warn people about impending landslides. We developed a novel ensemble BS-LSTM model (a combination of a Bi-LSTM model and a stacked LSTM model) and calibrated its parameters on the Tangni and Kumarhatti datasets. The soil movement data from the Tangni and Kumarhatti sites were split in an 80%/20% ratio to train and test the LSTM models. The developed LSTM models were first trained on the initial 80% training dataset with a one-step-ahead forecasting method and later tested on the remaining 20% testing dataset for both locations. Four performance measures, MAE, RMSE, NRMSE, and MRE, were used to record the performance of the models.</p>
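<p>The evaluation protocol above can be sketched in a few lines of Python (a minimal illustration; NRMSE and MRE admit several definitions in the literature, so the range-normalized and actual-relative conventions below are assumptions, not necessarily the authors&#x2019; exact formulas):</p>

```python
import math

def train_test_split(series, train_frac=0.8):
    """Chronological 80%/20% split: no shuffling across the boundary."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

def mae(actual, pred):
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def rmse(actual, pred):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

def nrmse(actual, pred):
    # RMSE normalized by the range of the observed values (one common convention).
    return rmse(actual, pred) / (max(actual) - min(actual))

def mre(actual, pred):
    # Mean relative error with respect to the observed values.
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, pred)) / len(actual)
```

<p>Lower values of all four measures indicate a better fit; NRMSE and MRE additionally make errors comparable across sensors whose movement ranges differ.</p>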
<p>For both the Tangni and Kumarhatti datasets, the ensemble BS-LSTM was the best model for forecasting soil movements during model training and testing, and the Bi-LSTM was the second-best model. An explanation for the BS-LSTM&#x2019;s performance might be that the inbuilt Bi-LSTM extracted more information from the input time series by training in both the forward and backward directions; the inbuilt stacked LSTM could then utilize this information to forecast the soil movement values.</p>
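<p>The two-stage data flow of the ensemble can be sketched with stand-in models (a minimal illustration with hypothetical names; the real stages are the trained Bi-LSTM and stacked LSTM networks, which are not reproduced here):</p>

```python
def bs_lstm_flow(series, look_back, stage1, stage2):
    """Stage 1 (stands in for the Bi-LSTM) forecasts one step ahead
    from each look-back window; stage 2 (stands in for the stacked
    LSTM) then refines each stage-1 forecast."""
    windows = [series[i:i + look_back] for i in range(len(series) - look_back)]
    stage1_forecasts = [stage1(w) for w in windows]
    return [stage2(f) for f in stage1_forecasts]

# Toy stand-ins: stage 1 averages its window, stage 2 passes through.
out = bs_lstm_flow([1.0, 2.0, 3.0, 4.0], look_back=2,
                   stage1=lambda w: sum(w) / len(w),
                   stage2=lambda f: f)
# out -> [1.5, 2.5]
```

<p>In the actual BS-LSTM, both stages are trained networks, so stage 2 learns to correct systematic errors in the stage-1 forecasts rather than merely passing them through.</p>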
<p>Another observation from this experiment was that the LSTM models trained on the Tangni landslide dataset showed some overfitting, where the training error was lower than the testing error. One reason could be that LSTM models, in general, require a large amount of data to train their parameters, whereas the Tangni dataset has only 62 data points. To investigate this limitation, the LSTM models were also trained on the larger Kumarhatti dataset, and the results were reasonably good without overfitting. During training and testing on the Kumarhatti dataset, the LSTM models showed almost no error in soil movement forecasting.</p>
<p>The LSTM models in this paper were developed to forecast sequences of soil movements. The CNN-LSTM and Conv-LSTM models were developed by ensembling a CNN with a simple LSTM model; this ensembling fed the spatial information of soil movements into the simple LSTM model, which increased these models&#x2019; performance. The developed Bi-LSTM and stacked LSTM models were non-ensemble models. The BS-LSTM model was developed using the Bi-LSTM and stacked LSTM models and was compared with the ensembles of CNN and simple LSTM models (CNN-LSTM and Conv-LSTM) for forecasting soil movements on the Tangni and Kumarhatti datasets. The training and test results demonstrated that the BS-LSTM model outperformed the ensembles of CNN and simple LSTM models.</p>
<p>Results reported in this paper have several implications for soil movement forecasting in the real world. First, the results show that an ensemble of RNN models (such as BS-LSTM) could be utilized to forecast soil movements at real-world landslide sites. Our findings show that the ensemble BS-LSTM outperforms both the non-ensemble models and the ensembles of CNN and simple LSTM models for forecasting soil movements. Furthermore, this is the first attempt to use recurrent neural network models to model soil movements at the Tangni and Kumarhatti sites. Such ensembles of recurrent neural networks may also have scope in other fields, such as social network analysis and natural language processing.</p>
<p>According to this article, recurrent neural network models might be useful in anticipating soil movements to warn people about impending landslides. During training and testing on the Kumarhatti dataset, the LSTM models showed almost no error in soil movement forecasting. In conclusion, the newly developed ensemble models generalized to forecasting soil movements on different landslides. In the future, these models could be used to forecast soil movements at other landslide sites in India and in other countries. Soil movement forecasting is a class imbalance problem, where movement events may be fewer than non-movement events. Also, machine learning models can overfit when training data are scarce. In such situations, generative adversarial networks (GANs) could generate synthetic soil movement data to address class imbalance in datasets (<xref ref-type="bibr" rid="B1">Al-Najjar and Pradhan, 2021</xref>). For example, Al-Najjar and Pradhan (2021) developed a GAN model for spatial landslide susceptibility assessment when training data were scarce. As part of our future plans, we would like to extend this research by developing various GAN models for soil movement forecasting. A portion of these ideas forms the immediate next stages in our project on soil-movement forecasting utilizing machine learning approaches.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>PK: Formulation of this study, methodology, and model implementation. PS: Data curation, writing, and original draft preparation. PC: Data collection. KU: Validation of the results and geotechnical parameters. VD: The compilation, examination, and interpretation of data for the&#x20;work.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>The project was supported by grants (awards: IITM/DST/VKU/300, IITM/DST/KVU/316, and IITM/DDMA-M/VD/325) to&#x20;VD.</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s9" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>We are grateful to the Department of Science and Technology and to District Disaster Management Authority Mandi for providing the funding for this project. We are also grateful to the Indian Institute of Technology Mandi for providing computational resources for this project.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Al-Najjar</surname>
<given-names>H. A. H.</given-names>
</name>
<name>
<surname>Pradhan</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Spatial Landslide Susceptibility Assessment Using Machine Learning Techniques Assisted by Additional Data Created With Generative Adversarial Networks</article-title>. <source>Geosci. Front.</source> <volume>12</volume> (<issue>2</issue>), <fpage>625</fpage>&#x2013;<lpage>637</lpage>. <pub-id pub-id-type="doi">10.1016/j.gsf.2020.09.002</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barzegar</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Aalami</surname>
<given-names>M. T.</given-names>
</name>
<name>
<surname>Adamowski</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Short-Term Water Quality Variable Prediction Using a Hybrid CNN-LSTM Deep Learning Model</article-title>. <source>Stoch Environ. Res. Risk Assess.</source> <volume>34</volume>, <fpage>415</fpage>&#x2013;<lpage>433</lpage>. <pub-id pub-id-type="doi">10.1007/s00477-020-01776-2</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Behera</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Naik</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Rath</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Dharavath</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Genetic Algorithm-Based Community Detection in Large-Scale Social Networks</article-title>. <source>Neural Comput. Appl.</source>, <fpage>1</fpage>&#x2013;<lpage>17</lpage>. </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Behera</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Naik</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Ramesh</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Rath</surname>
<given-names>S. K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Mr-ibc: Mapreduce-Based Incremental Betweenness Centrality in Large-Scale Complex Networks</article-title>. <source>Social Netw. Anal. Mining.</source> <volume>10</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1007/s13278-020-00636-9</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Behera</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Sahoo</surname>
<given-names>K. S.</given-names>
</name>
<name>
<surname>Naik</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Rath</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Sahoo</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>Structural Mining for Link Prediction Using Various Machine Learning Algorithms</article-title>. <source>Int. J.&#x20;Soc. Ecol. Sustainable Development (Ijsesd).</source> <volume>12</volume> (<issue>3</issue>), <fpage>66</fpage>&#x2013;<lpage>78</lpage>. <pub-id pub-id-type="doi">10.4018/ijsesd.2021070105</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Behera</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Jena</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rath</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Misra</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2021b</year>). <article-title>Co-LSTM: Convolutional LSTM Model for Sentiment Analysis in Social Big Data</article-title>. <source>Inf. Process. Management.</source> <volume>58</volume> (<issue>1</issue>), <fpage>102435</fpage>. <pub-id pub-id-type="doi">10.1016/j.ipm.2020.102435</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Behera</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Shukla</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Rath</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Misra</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Software Reliability Assessment Using Machine Learning Technique</article-title>. In <conf-name>International Conference on Computational Science and Its Applications</conf-name>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>, <fpage>403</fpage>&#x2013;<lpage>411</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-95174-4_32</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bengio</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Simard</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Frasconi</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>1994</year>). <article-title>Learning Long-Term Dependencies With Gradient Descent Is Difficult</article-title>. <source>IEEE Trans. Neural Netw.</source> <volume>5</volume> (<issue>2</issue>), <fpage>157</fpage>&#x2013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1109/72.279181</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chand</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Spatial Trends and Pattern of Landslides in the Hill State&#x20;Himachal Pradesh</article-title>. <source>Zenith Int. J.&#x20;Multidisciplinary Res.</source> <volume>4</volume> (<issue>12</issue>), <fpage>200</fpage>&#x2013;<lpage>210</lpage>. </citation>
</ref>
<ref id="B10">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Chaturvedi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Srivastava</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kaur</surname>
<given-names>P. B.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Landslide Early Warning System Development Using Statistical Analysis of Sensors&#x27; Data at Tangni Landslide, Uttarakhand, india</article-title>,&#x201d; in <conf-name>Proceedings of Sixth International Conference on Soft Computing for Problem Solving</conf-name>. Editors <person-group person-group-type="editor">
<name>
<surname>Deep</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Bansal</surname>
<given-names>J.C.</given-names>
</name>
<name>
<surname>Das</surname>
<given-names>K.N.</given-names>
</name>
<name>
<surname>Lal</surname>
<given-names>A.K.</given-names>
</name>
<name>
<surname>Garg</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Nagar</surname>
<given-names>A.K.</given-names>
</name>
<etal/>
</person-group> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer Singapore</publisher-name>), <fpage>259</fpage>&#x2013;<lpage>270</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-10-3325-4_26</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cui</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Ke</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Pu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>Deep Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction</article-title>,&#x201d; in <conf-name>Proceedings of 6th International Workshop on Urban Computing (UrbComp 2017)</conf-name>, <conf-loc>Halifax, NS, Canada</conf-loc>, <fpage>1</fpage>&#x2013;<lpage>11</lpage>.</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cui</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Hao</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Landslide Image Captioning Method Based on Semantic Gate and Bi-Temporal Lstm</article-title>. <source>ISPRS Inter. J.&#x20;Geo-Infor.</source> <volume>9</volume>, <fpage>194</fpage>&#x2013;<lpage>233</lpage>. <pub-id pub-id-type="doi">10.3390/ijgi9040194</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gal</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ghahramani</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2016</year>). <source>A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. In 30th Conference on Neural Information Processing Systems (NIPS 2016)</source>. <publisher-loc>Cambridge, Massachusetts</publisher-loc>: <publisher-name>Massachusetts Institute of Technology Press</publisher-name> <volume>29</volume>, <fpage>1019</fpage>&#x2013;<lpage>1027</lpage>.</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hochreiter</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Schmidhuber</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Long Short-Term Memory</article-title>. <source>Neural Comput.</source> <volume>9</volume>, <fpage>1735</fpage>&#x2013;<lpage>1780</lpage>. <pub-id pub-id-type="doi">10.1162/neco.1997.9.8.1735</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Bidirectional LSTM-CRF Models for Sequence Tagging</article-title>. <comment>arXiv abs/1508.01991</comment>. </citation>
</ref>
<ref id="B16">
<citation citation-type="book">
<collab>ICOMOS</collab> (<year>2008</year>). <source>Technical Evaluation Mission: 11&#x2013;16 September 2008</source>. <publisher-loc>Paris, France</publisher-loc>.
</citation>
</ref>
<ref id="B17">
<citation citation-type="web">
<collab>IndiaNews</collab> (<year>2013</year>). <article-title>Landslides Near Badrinath in Uttarakhand</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://tinyurl.com/y3vv9edv">https://tinyurl.com/y3vv9edv</ext-link> (Accessed April 7, 2019)</comment>. </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Glade</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Landslide Displacement Prediction Combining LSTM and SVR Algorithms: A Case Study of Shengjibao Landslide from the Three Gorges Reservoir Area</article-title>. <source>Appl. Sci.</source> <volume>10</volume> (<issue>21</issue>), <fpage>7830</fpage>. <pub-id pub-id-type="doi">10.3390/app10217830</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kahlon</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chandel</surname>
<given-names>V. B.</given-names>
</name>
<name>
<surname>Brar</surname>
<given-names>K. K.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Landslides in Himalayan Mountains: a Study of Himachal Pradesh, India</article-title>. <source>Int. J.&#x20;IT Eng. Appl. Sci. Res.</source> <volume>3</volume>, <fpage>28</fpage>&#x2013;<lpage>34</lpage>. </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khanduri</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Landslide Distribution and Damages During 2013 Deluge: A Case Study of Chamoli District, Uttarakhand</article-title>. <source>J.&#x20;Geogr. Nat. Disasters.</source> <volume>8</volume>. <pub-id pub-id-type="doi">10.4172/2167-0587.1000226</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Korup</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Stolle</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Landslide Prediction From Machine Learning</article-title>. <source>Geology Today.</source> <volume>30</volume>, <fpage>26</fpage>&#x2013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1111/gto.12034</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Priyanka</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pathania</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Agarwal</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mali</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Predictions of Weekly Slope Movements Using Moving-Average and Neural Network Methods: A Case Study in Chamoli, India</article-title>. <source>Soft Comput. Problem Solving.</source> <volume>2019</volume>, <fpage>67</fpage>&#x2013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-15-3287-0_6</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sihag</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pathania</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Agarwal</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mali</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Chaturvedi</surname>
<given-names>P.</given-names>
</name>
<etal/>
</person-group> (<year>2019a</year>). <article-title>Landslide Debris-Flow Prediction Using Ensemble and Non-ensemble Machine-Learning Methods: A Case-Study in Chamoli, India</article-title>. In <conf-name>Contributions to Statistics: Proceedings of the 6th International Conference on Time Series and Forecasting (ITISE)</conf-name>. <publisher-loc>Granada, Spain</publisher-loc>: <publisher-name>Springer</publisher-name>, <fpage>614</fpage>&#x2013;<lpage>625</lpage>. </citation>
</ref>
<ref id="B24">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sihag</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pathania</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Agarwal</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mali</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2019b</year>). <article-title>Predictions of Weekly Soil Movements Using Moving-Average and Support-Vector Methods: A Case-Study in Chamoli, India</article-title>. In <conf-name>International Conference on Information Technology in Geo-Engineering</conf-name>. <publisher-name>Springer</publisher-name>, <fpage>393</fpage>&#x2013;<lpage>405</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-32029-4_34</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sihag</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pathania</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Chaturvedi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Uday</surname>
<given-names>K. V.</given-names>
</name>
<name>
<surname>Dutt</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2021a</year>). &#x201c;<article-title>Comparison of Moving-Average, Lazy, and Information Gain Methods for Predicting Weekly Slope-Movements: A Case-Study in Chamoli, India</article-title>,&#x201d; in <source>Understanding and Reducing Landslide Disaster Risk. WLF 2020. ICL Contribution to Landslide Disaster Risk Reduction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Casagli</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Tofani</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Sassa</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Bobrowsky</surname>
<given-names>P. T.</given-names>
</name>
<name>
<surname>Takara</surname>
<given-names>K.</given-names>
</name>
</person-group>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>. <pub-id pub-id-type="doi">10.1007/978-3-030-60311-3_38</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sihag</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Pathania</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Chaturvedi</surname>
<given-names>P.</given-names>
</name>
<etal/>
</person-group> (<year>2021b</year>). <article-title>Prediction of Real-World Slope Movements via Recurrent and Non-Recurrent Neural Network Algorithms: A Case Study of the Tangni Landslide</article-title>. <source>Indian Geotechnical J.</source>, <fpage>1</fpage>&#x2013;<lpage>23</lpage>. </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumari</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Behera</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Sahoo</surname>
<given-names>K. S.</given-names>
</name>
<name>
<surname>Nayyar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kumar Luhach</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Prakash Sahoo</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Supervised Link Prediction Using Structured-Based Feature Extraction in the Social Networks</article-title>. <source>Concurrency Comput. Pract. Experience.</source>, <fpage>e5839</fpage>. <pub-id pub-id-type="doi">10.1002/cpe.5839</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bredin</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Barras</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>LSTM Based Similarity Measurement With Spectral Clustering for Speaker Diarization</article-title>,&#x201d; in <conf-name>Proceedings of Interspeech 2019</conf-name>, <conf-loc>Graz, Austria</conf-loc>. <pub-id pub-id-type="doi">10.21437/interspeech.2019-1388</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z.-q.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lacasse</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.-h.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>B.-b.</given-names>
</name>
<name>
<surname>Choi</surname>
<given-names>J.-c.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Algorithms for Intelligent Prediction of Landslide Displacements</article-title>. <source>J.&#x20;Zhejiang Univ. Sci. A.</source> <volume>21</volume>, <fpage>412</fpage>&#x2013;<lpage>429</lpage>. <pub-id pub-id-type="doi">10.1631/jzus.a2000005</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Medsker</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>L. C.</given-names>
</name>
</person-group> (<year>1999</year>). <source>Recurrent Neural Networks: Design and Applications. International Series on Computational Intelligence</source>. <publisher-loc>Boca Raton, Florida</publisher-loc>: <publisher-name>CRC Press</publisher-name>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://books.google.co.in/books?id=ME1SAkN0PyMC">https://books.google.co.in/books?id=ME1SAkN0PyMC</ext-link></comment>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Meng</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Gu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Displacement Prediction of Water-Induced Landslides Using a Recurrent Deep Learning Model</article-title>. <source>Eur. J.&#x20;Environ. Civil Eng.</source>, <fpage>1</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1080/19648189.2020.1763847</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Mikolov</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kombrink</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Burget</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Cernocky</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Khudanpur</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Extensions of Recurrent Neural Network Language Model</article-title>. In <conf-name>2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</conf-name>, <conf-loc>Prague, Czech Republic</conf-loc>. <publisher-name>IEEE</publisher-name>, <fpage>5528</fpage>&#x2013;<lpage>5531</lpage>. <pub-id pub-id-type="doi">10.1109/ICASSP.2011.5947611</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Niu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A Novel Decomposition-Ensemble Learning Model Based on Ensemble Empirical Mode Decomposition and Recurrent Neural Network for Landslide Displacement Prediction</article-title>. <source>Appl. Sci.</source> <volume>11</volume> (<issue>10</issue>), <fpage>4684</fpage>. <pub-id pub-id-type="doi">10.3390/app11104684</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pande</surname>
<given-names>R. K.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Landslide Problems in Uttaranchal, India: Issues and Challenges</article-title>. <source>Disaster Prev. Management.</source> <volume>15</volume>, <fpage>247</fpage>&#x2013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1108/09653560610659793</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Pathania</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sihag</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Chaturvedi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Uday</surname>
<given-names>K. V.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>A Low Cost, Sub-Surface IoT Framework for Landslide Monitoring, Warning, and Prediction</article-title>. In <conf-name>Proceedings of the 2020 International Conference on Advances in Computing, Communication, Embedded and Secure Systems</conf-name>. </citation>
</ref>
<ref id="B36">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Pham</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Bluche</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kermorvant</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Louradour</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Dropout Improves Recurrent Neural Networks for Handwriting Recognition</article-title>. In <conf-name>2014 14th International Conference on Frontiers in Handwriting Recognition</conf-name>. <publisher-name>IEEE</publisher-name>, <fpage>285</fpage>&#x2013;<lpage>290</lpage>. </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qiu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>DGeoSegmenter: A Dictionary-Based Chinese Word Segmenter for the Geoscience Domain</article-title>. <source>Comput. Geosciences.</source> <volume>121</volume>, <fpage>1</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1016/j.cageo.2018.08.006</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Shi</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Yeung</surname>
<given-names>D. Y.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>W. K.</given-names>
</name>
<name>
<surname>Woo</surname>
<given-names>W. c.</given-names>
</name>
</person-group> (<year>2015</year>). <source>Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting</source>. <publisher-loc>Cambridge</publisher-loc>, <fpage>802</fpage>&#x2013;<lpage>810</lpage>.</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Singh</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Determe</surname>
<given-names>J.&#x20;F.</given-names>
</name>
<name>
<surname>Doncker</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Horlin</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Crowd Forecasting Based on WiFi Sensors and LSTM Neural Networks</article-title>. <source>IEEE Trans. Instrum. Meas.</source> <volume>69</volume>, <fpage>6121</fpage>&#x2013;<lpage>6131</lpage>. <pub-id pub-id-type="doi">10.1109/TIM.2020.2969588</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="other">
<collab>THDC</collab> (<year>2009</year>). <source>Baseline Environment, Impacts and Mitigation Measures</source>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://thdc.co.in/sites/default/files/VPHEP-Env-VOL2.pdf">https://thdc.co.in/sites/default/files/VPHEP-Env-VOL2.pdf</ext-link> (Accessed August 16, 2021)</comment>.</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Surya</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Historical Records of Socio-Economically Significant Landslides in India</article-title>. <source>J.&#x20;South. Asia Disaster Stud.</source> <volume>4</volume>, <fpage>177</fpage>&#x2013;<lpage>204</lpage>. </citation>
</ref>
<ref id="B41">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>L.-C.</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Dimensional Sentiment Analysis Using a Regional CNN-LSTM Model</article-title>. In <conf-name>Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</conf-name>. <publisher-loc>Berlin, Germany</publisher-loc>, <fpage>225</fpage>&#x2013;<lpage>230</lpage>. <pub-id pub-id-type="doi">10.18653/v1/P16-2037</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Fang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Comparative Study of Landslide Susceptibility Mapping With Different Recurrent Neural Networks</article-title>. <source>Comput. Geosciences.</source> <volume>138</volume>, <fpage>104445</fpage>. <pub-id pub-id-type="doi">10.1016/j.cageo.2020.104445</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Westen</surname>
<given-names>C. J.&#x20;v.</given-names>
</name>
<name>
<surname>Rengers</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Terlien</surname>
<given-names>M. T. J.</given-names>
</name>
<name>
<surname>Soeters</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Prediction of the Occurrence of Slope Instability Phenomena Through GIS-Based Hazard Zonation</article-title>. <source>Geologische Rundschau.</source> <volume>86</volume>, <fpage>404</fpage>&#x2013;<lpage>414</lpage>. <pub-id pub-id-type="doi">10.1007/s005310050149</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xing</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yue</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Cong</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Bian</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Dynamic Displacement Forecasting of Dashuitian Landslide in China Using Variational Mode Decomposition and Stack Long Short-Term Memory Network</article-title>. <source>Appl. Sci.</source> <volume>9</volume>, <fpage>2951</fpage>. <pub-id pub-id-type="doi">10.3390/app9152951</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xing</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yue</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>A Hybrid Prediction Model of Landslide Displacement With Risk-Averse Adaptation</article-title>. <source>Comput. Geosciences.</source> <volume>141</volume>, <fpage>104527</fpage>. <pub-id pub-id-type="doi">10.1016/j.cageo.2020.104527</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Niu</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Displacement Prediction of Baijiabao Landslide Based on Empirical Mode Decomposition and Long Short-Term Memory Neural Network in Three Gorges Area, China</article-title>. <source>Comput. Geosciences.</source> <volume>111</volume>, <fpage>87</fpage>&#x2013;<lpage>96</lpage>. <pub-id pub-id-type="doi">10.1016/j.cageo.2017.10.013</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Lacasse</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Time Series Analysis and Long Short-Term Memory Neural Network to Predict Landslide Displacement</article-title>. <source>Landslides.</source> <volume>16</volume>, <fpage>677</fpage>&#x2013;<lpage>694</lpage>. <pub-id pub-id-type="doi">10.1007/s10346-018-01127-x</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Qu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Novel Hierarchical Algorithm for Bearing Fault Diagnosis Based on Stacked LSTM</article-title>. <source>Shock and Vibration.</source> <volume>2019</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>. <pub-id pub-id-type="doi">10.1155/2019/2756284</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Dynamic Forecast Model for Landslide Susceptibility Based on Deep Learning Methods</article-title>,&#x201d; in <conf-name>Proceedings of the 21st EGU General Assembly (EGU2019)</conf-name>, <conf-loc>Vienna, Austria</conf-loc>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://ui.adsabs.harvard.edu/abs/2019EGUGA.21.8941Z">https://ui.adsabs.harvard.edu/abs/2019EGUGA.21.8941Z</ext-link>
</comment>.</citation>
</ref>
</ref-list>
</back>
</article>