<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Appl. Math. Stat.</journal-id>
<journal-title>Frontiers in Applied Mathematics and Statistics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Appl. Math. Stat.</abbrev-journal-title>
<issn pub-type="epub">2297-4687</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fams.2023.1124091</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Applied Mathematics and Statistics</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Unsupervised vessel trajectory reconstruction</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Chen</surname> <given-names>Chih-Wei</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2157883/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Huang</surname> <given-names>Hsin-Hsiung</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1290150/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Applied Mathematics, National Sun Yat-sen University</institution>, <addr-line>Kaohsiung</addr-line>, <country>Taiwan</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Statistics and Data Science, University of Central Florida</institution>, <addr-line>Orlando, FL</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Eric Chung, The Chinese University of Hong Kong, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Stefano Marrone, Universit&#x000E0; Della Campania &#x0201C;Luigi Vanvitelli&#x0201D;, Italy; Jes&#x000FA;s Garc&#x000ED;a, Universidad Carlos III de Madrid, Spain</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Hsin-Hsiung Huang <email>hsin-hsiung.huang&#x00040;ucf.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Mathematics of Computation and Data Science, a section of the journal Frontiers in Applied Mathematics and Statistics</p></fn></author-notes>
<pub-date pub-type="epub">
<day>17</day>
<month>03</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>9</volume>
<elocation-id>1124091</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>27</day>
<month>02</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Chen and Huang.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Chen and Huang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>A trajectory is a sequence of observations in time and space, for example, the path formed by a maritime vessel, a piece of orbital debris, or an aircraft. Tracking and reconstructing vessel trajectories from Automatic Identification System (AIS) data is important for maritime navigation safety in real-world applications. In this project, we use the National Science Foundation (NSF)&#x00027;s Algorithms for Threat Detection program (ATD) 2019 Challenge AIS data to develop a novel trajectory reconstruction method. Given a sequence of <italic>N</italic> unlabeled timestamped observations <inline-formula><mml:math id="M1"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>x</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>x</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>x</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, the goal is to track trajectories by clustering the AIS points with predicted positions using the information from the true trajectories <inline-formula><mml:math id="M2"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula>. 
It is natural to connect the observed point <bold>x</bold><sub>&#x000EE;</sub> with the closest point estimated from the location, time, speed, and angle information of the points under consideration <bold>x</bold><sub><italic>i</italic></sub> &#x02200; <italic>i</italic> &#x02208; {1, 2, &#x02026;, <italic>N</italic>}. The proposed method is an unsupervised, clustering-based method; because it avoids training a supervised model, which may incur a significant computational cost, it supports real-time, reliable, and accurate trajectory reconstruction. Our experimental results show that the proposed method successfully clusters vessel trajectories.</p></abstract>
<kwd-group>
<kwd>Automatic Identification System (AIS)</kwd>
<kwd>clustering</kwd>
<kwd>Long Short-Term Memory (LSTM)</kwd>
<kwd>trajectory prediction</kwd>
<kwd>trajectory reconstruction</kwd>
</kwd-group>
<contract-sponsor id="cn001">Division of Mathematical Sciences<named-content content-type="fundref-id">10.13039/100000121</named-content></contract-sponsor>
<counts>
<fig-count count="6"/>
<table-count count="5"/>
<equation-count count="4"/>
<ref-count count="40"/>
<page-count count="9"/>
<word-count count="6674"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1. Introduction to trajectory reconstruction</title>
<p>The Automatic Identification System (AIS) is an automatic tracking system that all ships over 300 gross tonnage and all passenger ships are required to carry under the International Convention for the Safety of Life at Sea, issued by the International Maritime Organization (IMO), to improve maritime security and avoid ship collisions [<xref ref-type="bibr" rid="B1">1</xref>, <xref ref-type="bibr" rid="B2">2</xref>]. To address the challenges of tracking moving vessels using both space and time information to detect anomalous trajectories, the National Geospatial-Intelligence Agency (NGA) collaborated with the National Science Foundation (NSF)&#x00027;s Algorithms for Threat Detection program (ATD) to provide the ATD 2019 Challenge. The ATD 2019 AIS data [<xref ref-type="bibr" rid="B3">3</xref>] contain time-stamped information about a maritime vessel&#x00027;s movement, including latitude, longitude, course over ground (angle), and speed over ground. The task of the ATD 2019 Challenge is to track vessel trajectories in real time even when the AIS data lack complete vessel ID information because of technical issues or operational concerns. In this situation, there is no training set for applying supervised methods to identify vessels and predict trajectories, and hence unsupervised methods are required. Although existing unsupervised clustering methods can be used for predicting vessel trajectories, they may not provide the desired prediction accuracy [<xref ref-type="bibr" rid="B4">4</xref>]. 
We propose an unsupervised trajectory reconstruction method, which can also be used for space debris path prediction since space debris typically lacks known labels for model training [<xref ref-type="bibr" rid="B5">5</xref>], and we analyze three AIS datasets provided by NSF&#x00027;s ATD program and collected from the 1st of June to the 31st of July, 2019 (see <xref ref-type="table" rid="T1">Table 1</xref>).</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>The three AIS datasets.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>AIS dataset</bold></th>
<th valign="top" align="left"><bold>Time span (hh:mm:ss)</bold></th>
<th valign="top" align="left"><bold>Latitude span</bold></th>
<th valign="top" align="left"><bold>Longitude span</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">From 14:00:00 to 17:59:58</td>
<td valign="top" align="left">From 36.906505&#x000B0; to 37.049995&#x000B0;</td>
<td valign="top" align="left">From &#x02212;76.329934&#x000B0; to &#x02212;75.98009&#x000B0;</td>
</tr> <tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">From 14:00:00 to 17:59:59</td>
<td valign="top" align="left">From 36.906063&#x000B0; to 37.049933&#x000B0;</td>
<td valign="top" align="left">From &#x02212;76.329982&#x000B0; to &#x02212;75.98&#x000B0;</td>
</tr> <tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">From 14:00:00 to 17:59:58</td>
<td valign="top" align="left">From 36.906038&#x000B0; to 37.04974&#x000B0;</td>
<td valign="top" align="left">From &#x02212;76.329979&#x000B0; to &#x02212;75.980184&#x000B0;</td>
</tr></tbody>
</table>
</table-wrap>
<p>We use the term trajectory reconstruction for estimating the AIS positions and connecting them as trajectories [<xref ref-type="bibr" rid="B6">6</xref>]. Existing approaches to trajectory reconstruction include linear interpolation, curvilinear interpolation [<xref ref-type="bibr" rid="B7">7</xref>] and its improvements [<xref ref-type="bibr" rid="B8">8</xref>, <xref ref-type="bibr" rid="B9">9</xref>], and Recurrent Neural Networks (RNNs) [<xref ref-type="bibr" rid="B10">10</xref>]. Some of these methods employ physical models of movement information such as speed, direction, and time, typically using the speed over ground and course over ground, while others assume a distribution of vessel trajectories and fit it to historical records [<xref ref-type="bibr" rid="B11">11</xref>, <xref ref-type="bibr" rid="B12">12</xref>]. The state-of-the-art methods for trajectory reconstruction [<xref ref-type="bibr" rid="B13">13</xref>&#x02013;<xref ref-type="bibr" rid="B15">15</xref>] generally have the following three steps: (1) apply a clustering method [<xref ref-type="bibr" rid="B16">16</xref>, <xref ref-type="bibr" rid="B17">17</xref>] to group trajectory data according to their route patterns, (2) assign the vessel to one of these clusters, and (3) interpolate or predict the vessel trajectory based on the route pattern of the assigned cluster. However, these methods require a training set of stationary patterns, such as paths over long times and distances, and hence they are not applicable to the three AIS datasets that we analyzed, which consist of short-term, short-distance trajectories that lack long-term patterns.</p>
<p>Our main contributions include: (1) the design of a novel big-data-compliant unsupervised algorithm which automatically learns and extracts useful spatiotemporal information from AIS data; (2) spatiotemporal features that improve the accuracy of clustering the AIS points and reconstructing trajectories; and (3) the successful application of the proposed method to reconstruct vessel trajectories with real AIS data collected near Norfolk, Virginia, and with simulated data. The highlights of this paper are summarized as follows. The proposed vessel trajectory reconstruction method, which utilizes the spatiotemporal characteristics of AIS data, is unsupervised and therefore does not require a training set. The experimental results demonstrate the advantages of the proposed method when the training set is insufficient. Unlike traditional clustering methods, the proposed method represents each point by features given by its projected position based on speed and angle, so the computation only involves local information and thus runs fast.</p>
</sec>
<sec id="s2">
<title>2. Next-point nearest neighbor clustering method</title>
<p>We first introduce the next-point concept with the nearest neighbor classification method and then, using the proposed next-point method, develop nearest neighborhood clustering (NNC) for the case when the vessel IDs are unknown. In this section, we introduce a basic NNC method and design an advanced NNC trajectory reconstruction method. We compare the results of all these methods in the next section.</p>
<sec>
<title>2.1. Next-point connection</title>
<p>We convert the longitude and latitude into the Universal Transverse Mercator (UTM) coordinates, and then group the AIS points by the proposed nearest-neighbor clustering method. The next-point connection (NPC) clustering algorithm uses the distance defined as</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo class="qopname">min</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>K</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x02264;</mml:mo><mml:mi>s</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>O</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>K</italic> is the index set of the AIS points in the interval [<italic>t</italic><sub>0</sub>, <italic>t</italic>], <italic>d</italic><sub><italic>st</italic></sub> is the space-time distance, i.e., the Euclidean distance using all spatial and temporal features, [<italic>t</italic><sub>0</sub>, <italic>t</italic>] is a preset search range of time (an interval length of 1,000 s was used in our analysis), <italic>E</italic> and <italic>O</italic> stand for the estimated location and observed location at time <italic>t</italic>, respectively, and <italic>s</italic> is the set of variables used for finding the closest training points. The proposed clustering method contains the following steps:</p>
<list list-type="bullet">
<list-item><p>Step 1. Project each point&#x00027;s next location using its speed, direction, and the time differences between the point and its neighboring points.</p></list-item>
<list-item><p>Step 2. Find the closest location for each estimated point&#x00027;s location <italic>E</italic><sub><italic>i</italic></sub>(<italic>s</italic>) from each label <italic>i</italic> &#x02208; <italic>K</italic> before the test point&#x00027;s time <italic>t</italic><sub>0</sub> &#x02264; <italic>s</italic> &#x0003C; <italic>t</italic>.</p></list-item>
<list-item><p>Step 3. Assign the predicted label to the observed point <italic>O</italic>(<italic>t</italic>) based on its closest location <italic>E</italic><sub><italic>i</italic></sub>(<italic>s</italic>) in Step 2.</p></list-item>
</list>
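<p>The three steps above can be illustrated with a minimal sketch for the case in which candidate labels are available. It is not the authors&#x00027; implementation: the dictionary keys (x, y, t, sog, cog, label), the planar coordinates in meters (for example, UTM easting and northing), the course convention (degrees clockwise from north), and the function names are assumptions made for illustration, and the distance uses only the spatial part of the space-time distance.</p>
<preformat><![CDATA[
```python
import math

KNOT_TO_M_PER_S = 0.514444  # 1 knot expressed in meters per second


def project(point, dt):
    """Step 1: dead-reckon a point dt seconds ahead in planar (east, north) meters.

    Assumes 'sog' is in knots and 'cog' is in degrees clockwise from north.
    """
    speed = point["sog"] * KNOT_TO_M_PER_S
    course = math.radians(point["cog"])
    return (point["x"] + speed * math.sin(course) * dt,
            point["y"] + speed * math.cos(course) * dt)


def npc_label(observed, labeled_points, window=1000.0):
    """Steps 2 and 3: return the label of the closest estimated location.

    Only points recorded within `window` seconds before the observation
    are considered; None is returned when no such point exists.
    """
    best_label, best_dist = None, math.inf
    for cand in labeled_points:
        dt = observed["t"] - cand["t"]
        if not 0.0 < dt <= window:
            continue  # outside the search interval [t0, t)
        ex, ey = project(cand, dt)
        dist = math.hypot(observed["x"] - ex, observed["y"] - ey)
        if dist < best_dist:
            best_label, best_dist = cand["label"], dist
    return best_label
```
]]></preformat>
<p>In this sketch, each candidate point is projected to the time of the observed point, so the comparison is always made between an estimated location and an observed location at the same time.</p>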
<p>Although the NPC method is similar to the minimum spanning tree (MST) and single linkage cluster analysis (SLCA) [<xref ref-type="bibr" rid="B18">18</xref>, <xref ref-type="bibr" rid="B19">19</xref>], which combine the two clusters with the closest pair of points, NPC measures the distance using the estimated position <italic>E</italic> instead of the observed positions, and it only searches AIS points in a nearby time interval. When the labels of the AIS points in <italic>K</italic> are known, the NPC method becomes a classification method; points from the same vessel can then be pruned so that only the AIS point with time closest to <italic>t</italic> is used. The NPC methods, which use the nearest neighbors to predict a vessel ID (VID) at each time <italic>t</italic>, have the following weaknesses: (1) NPC classification requires known labels, which may not be available; (2) NPC clustering may merge different vessels, and features with large values may dominate the distance. Therefore, we focus on the clustering method and propose the following algorithm to solve these issues.</p>
</sec>
<sec>
<title>2.2. Trajectory-based clustering</title>
<p>We propose a clustering algorithm that is based on trajectory reconstruction, and thus called CBTR, which builds the trajectories of vessels by using local physical information. CBTR is based on an NPC clustering method which uses a doubly checked distance to improve accuracy. For each point in the data set, we select another point as its best possible next point (BPNP) and put them in the same cluster. Like MST and SLCA, BPNP groups all points into a dendrogram with several tree-type clusters because two distinct points might have the same BPNP. The trajectory can be visualized by connecting each point with its BPNP by a line segment. See <bold>Figure 6</bold> for an illustration.</p>
<p>Given an AIS data set, in which points are ordered by their recording time <italic>t</italic><sub><italic>i</italic></sub>, we denote each point by <italic>x</italic><sub><italic>i</italic></sub>. The two-dimensional position of <italic>x</italic><sub><italic>i</italic></sub> on the earth is denoted as <italic>p</italic><sub><italic>i</italic></sub> &#x0003D; [LAT(<italic>x</italic><sub><italic>i</italic></sub>), LON(<italic>x</italic><sub><italic>i</italic></sub>)], where LAT and LON stand for the latitude and longitude of an AIS point <italic>x</italic>, and its speed and course are denoted by <italic>v</italic><sub><italic>i</italic></sub> and <italic>c</italic><sub><italic>i</italic></sub>, respectively. For every <italic>x</italic><sub><italic>i</italic></sub>, we use its velocity, namely speed and course, to predict its future position and look for its best possible next point <italic>x</italic><sub><italic>j</italic></sub>. If there is no point inside a reasonable search range, then we consider <italic>x</italic><sub><italic>i</italic></sub> an endpoint of a trajectory. We trace each trail until an endpoint occurs and thus finish the clustering. The CBTR algorithm is designed as follows:</p>
<list list-type="bullet">
<list-item><p>Step 1. For a given point <italic>x</italic><sub><italic>i</italic></sub> at time <italic>t</italic><sub><italic>i</italic></sub>, we derive a linearly approximated <italic>future trajectory</italic> &#x003B3;<sub><italic>i</italic></sub> within 1,000 s by using its instantaneous speed <italic>v</italic><sub><italic>i</italic></sub> and course <italic>c</italic><sub><italic>i</italic></sub>, i.e., the predicted position is defined by &#x003B3;<sub><italic>i</italic></sub> &#x0003D; <italic>x</italic><sub><italic>i</italic></sub> &#x0002B; <italic>v</italic><sub><italic>i</italic></sub> &#x000B7; &#x00394;<italic>T</italic>, where &#x00394;<italic>T</italic> is the time period, which will be chosen in Step 2.</p></list-item>
<list-item><p>Step 2. Collect all points appearing in the time window <italic>t</italic> &#x02208; (<italic>t</italic><sub><italic>i</italic></sub>, <italic>t</italic><sub><italic>i</italic></sub> &#x0002B; 1000) and denote the collection as <inline-formula><mml:math id="M4"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Consider the closeness of &#x003B3;<sub><italic>i</italic></sub> and each point <italic>x</italic> in <inline-formula><mml:math id="M5"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> by computing a bi-directional distance <italic>D</italic> between &#x003B3;<sub><italic>i</italic></sub> and <italic>x</italic>, where &#x00394;<italic>T</italic> is chosen to be the time difference between <italic>x</italic><sub><italic>i</italic></sub> and <italic>x</italic>. Impose a spatiotemporal angle condition to exclude points with an exaggerated turning course and denote the remaining points as <inline-formula><mml:math id="M6"><mml:mover accent="false"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula>. Let the BPNP of <italic>x</italic><sub><italic>i</italic></sub> be the point in the collection <inline-formula><mml:math id="M7"><mml:mover accent="false"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula> that has the smallest <italic>D</italic> and satisfies the angle condition. 
Denote this smallest <italic>D</italic> as <italic>D</italic><sub><italic>i</italic></sub>, namely, <inline-formula><mml:math id="M8"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo class="qopname">min</mml:mo></mml:mrow><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mover accent="false"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo class="qopname">&#x0007E;</mml:mo></mml:mover></mml:mrow></mml:munder><mml:mi>D</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. When <inline-formula><mml:math id="M9"><mml:mover accent="false"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula> is empty, <italic>D</italic><sub><italic>i</italic></sub> is defined to be infinity.</p></list-item>
<list-item><p>Step 3. To choose the endpoint threshold, we use the normalized distance <inline-formula><mml:math id="M10"><mml:mover accent="true"><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>. Sort all AIS points <italic>x</italic><sub><italic>i</italic></sub> according to <inline-formula><mml:math id="M11"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in descending order. 
Compute the ratio <inline-formula><mml:math id="M12"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and find the first <italic>i</italic> whose ratio is less than a preset cutoff (1.2 was used in our experiments). Take <inline-formula><mml:math id="M13"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> as the threshold and treat an AIS point as an endpoint of a projected line if its <inline-formula><mml:math id="M14"><mml:mover accent="true"><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:math></inline-formula> is larger than the threshold. Finally, cluster vessels by using these endpoints.</p></list-item>
</list>
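<p>The threshold rule of Step 3 can be sketched as follows. This is an illustrative reading of the rule rather than the authors&#x00027; code: the function names are hypothetical, and two choices are assumptions made here, namely that points with an infinite normalized distance (no admissible BPNP) are always treated as endpoints and are left out of the sorted list, and that an infinite threshold is returned when no ratio falls below the cutoff.</p>
<preformat><![CDATA[
```python
import math


def endpoint_threshold(norm_dists, ratio_cut=1.2):
    """Pick the endpoint threshold from the normalized distances.

    The finite values are sorted in descending order, the ratio between
    entries i and i + 2 is scanned, and the entry at i + 1 is returned
    for the first i whose ratio drops below `ratio_cut`.
    """
    finite = sorted((d for d in norm_dists if math.isfinite(d)), reverse=True)
    for i in range(len(finite) - 2):
        if finite[i + 2] > 0 and finite[i] / finite[i + 2] < ratio_cut:
            return finite[i + 1]
    return math.inf  # assumption: no stable ratio means no finite threshold


def find_endpoints(norm_dists, ratio_cut=1.2):
    """Return the indices of points treated as trajectory endpoints."""
    threshold = endpoint_threshold(norm_dists, ratio_cut)
    return [i for i, d in enumerate(norm_dists)
            if math.isinf(d) or d > threshold]
```
]]></preformat>
<p>The scan stops where consecutive sorted values stop dropping sharply, so the few large normalized distances at the head of the list are the ones flagged as endpoints.</p>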
<sec>
<title>2.2.1. Bi-directional distance</title>
<p>The bi-directional distance <italic>D</italic> and the turning angle condition are the most crucial elements in CBTR, so we provide details of them as follows.</p>
<p>We compute the (squared) distance between &#x003B3;<sub><italic>i</italic></sub> and points in <inline-formula><mml:math id="M15"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. If <inline-formula><mml:math id="M16"><mml:msup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mo>|</mml:mo><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msup><mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> is small, then <italic>x</italic><sub><italic>j</italic></sub> is probably the next AIS point of <italic>x</italic><sub><italic>i</italic></sub>. However, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, there might be another vessel (colored in red) appearing in the direction of &#x003B3;<sub><italic>i</italic></sub> (the black arrow). In order to catch the correct BPNP <italic>x</italic><sub><italic>m</italic></sub> for <italic>x</italic><sub><italic>i</italic></sub>, we use the information of <italic>x</italic><sub><italic>m</italic></sub> and <italic>x</italic><sub><italic>n</italic></sub> to double-check. 
Precisely, we compute the backward locations &#x003C3;<sub><italic>m</italic></sub> and &#x003C3;<sub><italic>n</italic></sub> of <italic>x</italic><sub><italic>m</italic></sub> and <italic>x</italic><sub><italic>n</italic></sub>, respectively, namely the gray arrow and red arrow in <xref ref-type="fig" rid="F1">Figure 1</xref>. One sees that the inconsistency between the black arrow and the red arrow can exclude <italic>x</italic><sub><italic>n</italic></sub> as a BPNP of <italic>x</italic><sub><italic>i</italic></sub>. On the other hand, <italic>x</italic><sub><italic>m</italic></sub> is much more likely to be the BPNP of <italic>x</italic><sub><italic>i</italic></sub> because <italic>x</italic><sub><italic>i</italic></sub> lies around the region that the gray arrow indicates. Therefore, we consider the bi-directional distance <inline-formula><mml:math id="M17"><mml:mi>D</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> in Step 2. This (squared) bi-directional distance can resolve the intersection problem in trajectory analysis.</p>
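<p>A minimal sketch of the bi-directional distance is given below. It is an illustration under stated assumptions rather than the authors&#x00027; implementation: positions are planar coordinates in meters, speed is in knots, course is in degrees clockwise from north, the backward location &#x003C3;<sub><italic>j</italic></sub> is taken to be <italic>x</italic><sub><italic>j</italic></sub> moved back along its own course over the same time difference, and the dictionary keys and function names are hypothetical.</p>
<preformat><![CDATA[
```python
import math

KNOT_TO_M_PER_S = 0.514444  # 1 knot expressed in meters per second


def displace(point, dt):
    """Move a point dt seconds along its course (dt may be negative)."""
    step = point["sog"] * KNOT_TO_M_PER_S * dt
    course = math.radians(point["cog"])
    return (point["x"] + step * math.sin(course),
            point["y"] + step * math.cos(course))


def sq_dist(a, b):
    """Squared Euclidean distance between two planar positions."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def bidirectional_distance(xi, xj):
    """D(x_i, x_j) = 0.5 * [d^2(gamma_i, x_j) + d^2(sigma_j, x_i)]."""
    dt = xj["t"] - xi["t"]
    gamma_i = displace(xi, dt)    # forward projection of x_i to time t_j
    sigma_j = displace(xj, -dt)   # backward projection of x_j to time t_i
    return 0.5 * (sq_dist(gamma_i, (xj["x"], xj["y"]))
                  + sq_dist(sigma_j, (xi["x"], xi["y"])))
```
]]></preformat>
<p>In this sketch, a candidate that lies near the forward projection but is heading in an inconsistent direction receives a large backward term, which mirrors the role of the red arrow in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>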
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The direction of &#x003B3;<sub><italic>i</italic></sub> is indicated by the black arrow. The directions of &#x003C3;<sub><italic>m</italic></sub> and &#x003C3;<sub><italic>n</italic></sub> are indicated by the gray arrow and the red arrow, respectively. <inline-formula><mml:math id="M18"><mml:msup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, the (squared) distance between <italic>x</italic><sub><italic>i</italic></sub> and the end of the gray arrow, is much smaller than <inline-formula><mml:math id="M19"><mml:msup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. Hence <italic>x</italic><sub><italic>m</italic></sub> is a better prediction than <italic>x</italic><sub><italic>n</italic></sub>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1124091-g0001.tif"/>
</fig>
<p>On the other hand, to prevent the endpoint of a trajectory from being connected to another vessel, we impose a turning angle condition that involves both spatial and temporal information. Roughly speaking, if the trajectory would have to make a sudden, unreasonable turn to connect to its BPNP, then the trajectory should terminate there. Measuring the spatial angle alone is not enough, because a vessel sometimes makes a large turn over a reasonable time period, so we consider a spatiotemporal angle. However, there is no natural exchange rate between temporal and spatial scales, and we must define a suitable one.</p>
<p>To obtain a meaningful space-time distance, the scales of the different spatiotemporal features must be balanced. Pooled normalization (a feature&#x00027;s values divided by its range) and standardization (a feature&#x00027;s values divided by its standard deviation) are not suitable here, since the ranges of the spatiotemporal features vary considerably from vessel to vessel. Consequently, we propose a dynamic scale conversion rate that depends on the vessel&#x00027;s speed and direction.</p>
<p>Considering that 1 knot is about 5&#x000B7;10<sup>&#x02212;4</sup> km/s and that the diagonal of a <italic>longitude unit square</italic> is about 124.45 km long in our data set, we choose &#x003C4; &#x0003D; 4&#x000B7;10<sup>&#x02212;6</sup> &#x02248; 5&#x000B7;10<sup>&#x02212;4</sup>/124.45 and &#x003B1; &#x0003D; 110.57/(111.32&#x000B7;cos&#x003B8;), which are ratio estimators [<xref ref-type="bibr" rid="B20">20</xref>] used to rescale the data. The scaling factor &#x003B1; converts distances from degrees of latitude into degrees of longitude so that the two are comparable; the factor &#x003C4; normalizes the time scale so that temporal differences are on a scale similar to that of spatial distances. Namely, for any two AIS points <italic>x</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>j</italic></sub>, the spatiotemporal vector from <italic>x</italic><sub><italic>i</italic></sub> to <italic>x</italic><sub><italic>j</italic></sub> is defined by</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M20"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mover class="overrightarrow"><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x020D7;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x000B7;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext>T</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mtext>T</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x000B7;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext>LAT</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mtext>LAT</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mtext>LON</mml:mtext><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x02003;&#x02003;</mml:mtext><mml:mo>-</mml:mo><mml:mtext>LON</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where T(<italic>x</italic>) denotes the time of the AIS point <italic>x</italic>. When the angle &#x003C6; between <inline-formula><mml:math id="M22"><mml:mover class="overrightarrow"><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x020D7;</mml:mo></mml:mover></mml:math></inline-formula> and &#x003B3;<sub><italic>i</italic></sub> is too large, say cos &#x003C6; &#x0003C; 0.1, we remove <italic>x</italic><sub><italic>j</italic></sub> from the candidates for the BPNP of <italic>x</italic><sub><italic>i</italic></sub>.</p>
<p>Definition 2.1 (Turning Angle Condition). The trajectory shall not make a sudden and unreasonable turn in which the spatiotemporal angle &#x003C6; is greater than cos<sup>&#x02212;1</sup>(0.1).</p>
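<p>As an illustration only, the spatiotemporal vector in Equation (2) and the turning angle condition can be sketched in Python as follows. The tuple layout (time in seconds, latitude, longitude), the function names, the reading of &#x003B8; in &#x003B1; as the latitude of the starting point, and the representation of &#x003B3;<sub><italic>i</italic></sub> as a three-component vector are assumptions made for this sketch and are not specified in the text.</p>
<preformat><![CDATA[
```python
import math

TAU = 4e-6  # time scaling factor tau quoted in the text


def alpha(latitude_deg):
    """Latitude-to-longitude scale alpha = 110.57 / (111.32 * cos(theta)).

    Assumption: theta is the latitude, given here in degrees.
    """
    return 110.57 / (111.32 * math.cos(math.radians(latitude_deg)))


def spacetime_vector(p, q, tau=TAU):
    """Scaled (time, latitude, longitude) difference from p to q, as in Equation (2).

    p and q are hypothetical (time_sec, lat_deg, lon_deg) tuples.
    Assumption: alpha is evaluated at the latitude of the starting point p.
    """
    a = alpha(p[1])
    return (tau * (q[0] - p[0]), a * (q[1] - p[1]), q[2] - p[2])


def cos_angle(u, v):
    """Cosine of the angle between two 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("a zero-length vector has no direction")
    return dot / norm


def satisfies_turning_angle(p, q, gamma, threshold=0.1):
    """True when q may remain a BPNP candidate of p, i.e. cos(phi) >= threshold."""
    return cos_angle(spacetime_vector(p, q), gamma) >= threshold
```
]]></preformat>
<p>Under these assumptions, a candidate lying ahead of the vessel along &#x003B3;<sub><italic>i</italic></sub> passes the test, whereas a candidate that requires reversing the longitude direction within the same time span yields a negative cosine and is removed.</p>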
<p>Thus we obtain <inline-formula><mml:math id="M23"><mml:mover accent="false"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula> from <inline-formula><mml:math id="M24"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">N</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in Step 2. On the other hand, steady vessels that move very slowly, and which may be anchored, drift with water currents and thus have randomly changing courses [<xref ref-type="bibr" rid="B21">21</xref>]. Therefore, for steady pairs <italic>x</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>j</italic></sub> with average speed smaller than 0.15 knots [<xref ref-type="bibr" rid="B22">22</xref>], we do not use the forward-backward distance and simply measure <italic>D</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>j</italic></sub>) by <inline-formula><mml:math id="M25"><mml:msup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. The spatiotemporal vector representation of the AIS points in Equation (2) induces a linear model for the next point <inline-formula><mml:math id="M26"><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. 
Suppose that at the current time <italic>t</italic><sub>0</sub> the point is <italic>x</italic><sub><italic>i</italic></sub>, that at time <italic>s</italic> &#x02208; (<italic>t</italic><sub>0</sub>, <italic>t</italic>) the point is <inline-formula><mml:math id="M27"><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and that <inline-formula><mml:math id="M28"><mml:mover class="overrightarrow"><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>&#x020D7;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. We use the current speed speed(<italic>x</italic><sub><italic>i</italic></sub>) and angle &#x003B8;(<italic>x</italic><sub><italic>i</italic></sub>) to approximate the dynamic speed and angle from time <italic>t</italic><sub>0</sub> to time <italic>s</italic>. The moving distance, whose true value is the integral of the dynamic speed over (<italic>t</italic><sub>0</sub>, <italic>s</italic>), is then estimated by the product of the moving time and the current speed, and <italic>y</italic><sub>2</sub> and <italic>y</italic><sub>3</sub> come from the first-order Taylor approximation of the angle around <italic>x</italic><sub><italic>i</italic></sub>, which is substituted into the cosine and sine. Consequently, we have the following regression models</p>
<disp-formula id="E4"><label>(3)</label><mml:math id="M29"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x000B7;</mml:mo><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mtext>LAT</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mtext>LON</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M30"><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mtext>T</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mtext>T</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M31"><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mtext>LAT</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000B7;</mml:mo><mml:mtext>speed</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000B7;</mml:mo><mml:mo class="qopname">cos</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mtext>LAT</mml:mtext></mml:mrow></mml:msubsup></mml:math></inline-formula>, <inline-formula><mml:math 
id="M32"><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mtext>LON</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000B7;</mml:mo><mml:mtext>speed</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000B7;</mml:mo><mml:mo class="qopname">sin</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mtext>LON</mml:mtext></mml:mrow></mml:msubsup></mml:math></inline-formula>, and <inline-formula><mml:math id="M33"><mml:msubsup><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mtext>LAT</mml:mtext></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M34"><mml:msubsup><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mtext>LON</mml:mtext></mml:mrow></mml:msubsup></mml:math></inline-formula> are white noises that can be viewed as the errors from the linear approximations of the speed and angle.</p>
<p>In pursuit of better performance, we consider different values of the parameter &#x003C4; according to the types of vessels. However, the types of vessels are not provided in our AIS data, so we instead adjust the value of &#x003C4; based on the speed of the vessels. This method is essentially a ratio estimator in cluster sampling [<xref ref-type="bibr" rid="B23">23</xref>]. For faster vessels with speeds of at least 4 knots, we use a larger value, &#x003C4; &#x0003D; 4&#x000B7;10<sup>&#x02212;5</sup>, so that the time difference is scaled to be comparable to the spatial difference in <inline-formula><mml:math id="M35"><mml:mover class="overrightarrow"><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x020D7;</mml:mo></mml:mover></mml:math></inline-formula>. For slower vessels with speeds below 4 knots, we use &#x003C4; &#x0003D; 4&#x000B7;10<sup>&#x02212;6</sup> as proposed above. To demonstrate the performance of CBTR on different types of vessels, we present individual results for four categories of vessel pairs according to their average speed <italic>S</italic> (in knots): (1) <italic>x</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>j</italic></sub> are called a <italic>high speed pair</italic> if <italic>S</italic> is larger than or equal to 16; (2) a <italic>fast pair</italic> if 4 &#x02264; <italic>S</italic> &#x0003C; 16; (3) a <italic>slow pair</italic> if 0.15 &#x02264; <italic>S</italic> &#x0003C; 4; (4) a <italic>steady pair</italic> if <italic>S</italic> &#x0003C; 0.15. We use &#x003C4; &#x0003D; 4&#x000B7;10<sup>&#x02212;5</sup> for the first two categories, which have faster speeds, and &#x003C4; &#x0003D; 4&#x000B7;10<sup>&#x02212;6</sup> for the last two categories. The results are shown in <xref ref-type="table" rid="T5">Table 5</xref>.</p>
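<p>The speed categories and the corresponding choice of &#x003C4; can be summarized in a minimal Python sketch. The function names are hypothetical; the thresholds (16, 4, and 0.15 knots) and the two values of &#x003C4; are those stated above.</p>
<preformat><![CDATA[
```python
def pair_category(avg_speed_knots):
    """Speed category of a pair of AIS points from its average speed S in knots."""
    if avg_speed_knots >= 16:
        return "high speed"
    if avg_speed_knots >= 4:
        return "fast"
    if avg_speed_knots >= 0.15:
        return "slow"
    return "steady"


def tau_for_pair(avg_speed_knots):
    """Time scaling factor: 4e-5 for high speed and fast pairs, 4e-6 otherwise."""
    if pair_category(avg_speed_knots) in ("high speed", "fast"):
        return 4e-5
    return 4e-6
```
]]></preformat>
<p>Keeping the categorization separate from the choice of &#x003C4; makes it possible to report results per category while changing the scaling rule independently.</p>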
<p>The proposed CBTR algorithm can be viewed as a special case of the weighted-average plug-in classifier [<xref ref-type="bibr" rid="B24">24</xref>, <xref ref-type="bibr" rid="B25">25</xref>], with weights given by <italic>w</italic><sub><italic>i</italic></sub>(<italic>x</italic>) &#x0003D; 1/<italic>k</italic> if <italic>x</italic><sub><italic>i</italic></sub> is one of the <italic>k</italic> nearest neighbors of <italic>x</italic> in the search range <italic>S</italic>, and <italic>w</italic><sub><italic>i</italic></sub>(<italic>x</italic>) &#x0003D; 0 otherwise. Stone&#x00027;s theorem establishes consistency of the proposed clustering method provided that the weights satisfy certain conditions [<xref ref-type="bibr" rid="B26">26</xref>].</p>
</sec>
</sec>
</sec>
<sec id="s3">
<title>3. Results of data analysis</title>
<sec>
<title>3.1. Comparison of algorithms</title>
<p>We evaluate the results by the correct-neighbor rate, defined as <inline-formula><mml:math id="M36"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:mi>I</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mi>n</mml:mi><mml:mo>,</mml:mo></mml:math></inline-formula> where <italic>Y</italic><sub><italic>i</italic></sub> is the label of the <italic>i</italic>-th AIS point and <italic>Y</italic><sub><italic>j</italic></sub> is the label of its closest neighbor. In the CBTR algorithm, every AIS point is either assigned a BPNP or determined to be an endpoint of a trajectory. Let <italic>M</italic> be the total number of mistakes made in this process; the correct-neighbor rate is then 1&#x02212;<italic>M</italic>/<italic>n</italic>. The proposed method does not aim to recover the correct sequential order of a trajectory: the accuracy defined above only considers whether the cluster labels are correct. For example, the method may group one vessel&#x00027;s AIS points in the order (1, 3, 2) although the true order is (1, 2, 3); this is still counted as a correct clustering result.</p>
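<p>The rate 1&#x02212;<italic>M</italic>/<italic>n</italic> can be illustrated with a small Python sketch. The function name and the input layout (a list of true labels and, for each point, the index of its assigned next point or <italic>None</italic> for an endpoint) are assumptions of this sketch. As a simplification, only the assigned links are checked; endpoint decisions are taken as given, which is not necessarily how mistakes at endpoints are counted in our experiments.</p>
<preformat><![CDATA[
```python
def correct_neighbor_rate(labels, next_index):
    """Return 1 - M/n, where M counts links that join points with different labels.

    labels[i]     : true vessel label of point i.
    next_index[i] : index of the point linked after point i, or None when
                    point i is treated as an endpoint.
    Simplification (assumption): endpoint decisions are not checked.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("at least one AIS point is required")
    if len(next_index) != n:
        raise ValueError("labels and next_index must have the same length")
    mistakes = 0
    for i, j in enumerate(next_index):
        if j is not None and labels[j] != labels[i]:
            mistakes += 1
    return 1 - mistakes / n
```
]]></preformat>
<p>Because only labels are compared, linking one vessel&#x00027;s points in the order (1, 3, 2) produces no mistake in this sketch, in line with the remark above.</p>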
<p>We compare CBTR with other methods, including the LSTM recurrent neural network (RNN) architecture [<xref ref-type="bibr" rid="B27">27</xref>&#x02013;<xref ref-type="bibr" rid="B29">29</xref>] and the EM clustering algorithm [<xref ref-type="bibr" rid="B30">30</xref>, <xref ref-type="bibr" rid="B31">31</xref>], which assumes that the clusters follow a Gaussian mixture distribution. The Expectation-Maximization (EM) algorithm for a Gaussian mixture model iteratively estimates the cluster membership probability of each observation through the E-step and M-step. Each EM cluster is determined by its mean and variance, so the method is suitable for vessels that are anchored or moving randomly around a fixed location. Since most vessels move with varying speeds and directions, EM clustering does not perform well on our datasets. The correct-neighbor rates of the methods are compared in <xref ref-type="table" rid="T2">Tables 2</xref>, <xref ref-type="table" rid="T3">3</xref>.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Correct-neighbor rates of each method on the AIS data with speed &#x0003E; 3 knots in the three datasets.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Methods</bold></th>
<th valign="top" align="center"><bold>Set 1</bold></th>
<th valign="top" align="center"><bold>Set 2</bold></th>
<th valign="top" align="center"><bold>Set 3</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">NPC classification</td>
<td valign="top" align="center">0.9942</td>
<td valign="top" align="center">0.9881</td>
<td valign="top" align="center">0.9942</td>
</tr> <tr>
<td valign="top" align="left">NPC clustering</td>
<td valign="top" align="center">0.9732</td>
<td valign="top" align="center">0.9481</td>
<td valign="top" align="center">0.9842</td>
</tr> <tr>
<td valign="top" align="left">CBTR</td>
<td valign="top" align="center">0.9986</td>
<td valign="top" align="center">0.9982</td>
<td valign="top" align="center">0.9973</td>
</tr> <tr>
<td valign="top" align="left">LSTM</td>
<td valign="top" align="center">0.6580</td>
<td valign="top" align="center">0.6749</td>
<td valign="top" align="center">0.6534</td>
</tr> <tr>
<td valign="top" align="left">EM clustering</td>
<td valign="top" align="center">0.1580</td>
<td valign="top" align="center">0.1749</td>
<td valign="top" align="center">0.1643</td>
</tr></tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Correct-neighbor rates of each method on the AIS data with speed &#x02264; 3 knots in the three datasets.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Methods</bold></th>
<th valign="top" align="center"><bold>Set 1</bold></th>
<th valign="top" align="center"><bold>Set 2</bold></th>
<th valign="top" align="center"><bold>Set 3</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">NPC classification</td>
<td valign="top" align="center">0.9942</td>
<td valign="top" align="center">0.9881</td>
<td valign="top" align="center">0.9942</td>
</tr> <tr>
<td valign="top" align="left">NPC clustering</td>
<td valign="top" align="center">0.9732</td>
<td valign="top" align="center">0.9481</td>
<td valign="top" align="center">0.9842</td>
</tr> <tr>
<td valign="top" align="left">CBTR</td>
<td valign="top" align="center">0.9986</td>
<td valign="top" align="center">0.9982</td>
<td valign="top" align="center">0.9973</td>
</tr> <tr>
<td valign="top" align="left">LSTM</td>
<td valign="top" align="center">0.6580</td>
<td valign="top" align="center">0.6749</td>
<td valign="top" align="center">0.6534</td>
</tr> <tr>
<td valign="top" align="left">EM clustering</td>
<td valign="top" align="center">0.1580</td>
<td valign="top" align="center">0.1749</td>
<td valign="top" align="center">0.1643</td>
</tr></tbody>
</table>
</table-wrap>
<p>The time complexity of the proposed CBTR method is <italic>O</italic>(<italic>nr</italic>) with the sample size <italic>n</italic> and the neighborhood size <italic>r</italic>. See the computational time for each method in <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Computational time (in seconds) of each method on the three datasets.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Methods</bold></th>
<th valign="top" align="center"><bold>Set 1</bold></th>
<th valign="top" align="center"><bold>Set 2</bold></th>
<th valign="top" align="center"><bold>Set 3</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">NPC classification</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">27</td>
<td valign="top" align="center">23</td>
</tr> <tr>
<td valign="top" align="left">NPC clustering</td>
<td valign="top" align="center">25</td>
<td valign="top" align="center">26</td>
<td valign="top" align="center">27</td>
</tr> <tr>
<td valign="top" align="left">CBTR</td>
<td valign="top" align="center">19</td>
<td valign="top" align="center">26</td>
<td valign="top" align="center">17</td>
</tr> <tr>
<td valign="top" align="left">LSTM</td>
<td valign="top" align="center">278</td>
<td valign="top" align="center">405</td>
<td valign="top" align="center">262</td>
</tr> <tr>
<td valign="top" align="left">EM clustering</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">31</td>
<td valign="top" align="center">27</td>
</tr></tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>3.2. Results of CBTR and experiments by sampling</title>
<p>In <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>, one sees that CBTR is able to regroup most of the trajectories correctly. We leave the detailed explanation of these plots to the <xref ref-type="supplementary-material" rid="SM1">Supplementary material</xref>. One may evaluate the performance of CBTR by two numbers: jumps and merges. The former counts the trajectory breaks wrongly made by CBTR and the latter counts the wrong groupings CBTR makes. Compared with counting how many points are connected to a wrong next point, the sum of jumps and merges reflects the performance of CBTR more faithfully. Since each jump creates a new cluster and each merge cancels one, the difference between them is exactly the difference between the number of clusters found by CBTR and the number of vessels in our data. Namely, we have the following identity: jumps &#x02212; merges &#x0003D; &#x00023;{predicted clusters}&#x02212;&#x00023;{actual vessels}.</p>
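<p>The counting argument, in which each jump adds one cluster and each merge removes one, can be checked with a few lines of Python. The function name and the numbers in the usage remark are purely illustrative and are not taken from our data sets.</p>
<preformat><![CDATA[
```python
def predicted_cluster_count(n_vessels, jumps, merges):
    """Number of clusters obtained when every jump adds a cluster and every merge removes one."""
    count = n_vessels + jumps - merges
    if count < 0:
        raise ValueError("more merges than available clusters")
    return count
```
]]></preformat>
<p>For instance, with 20 hypothetical vessels, 3 jumps, and 2 merges, the function returns 21 clusters, so that the number of predicted clusters exceeds the number of vessels by jumps minus merges.</p>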
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>The clustering results of data set 1. The numbers on the horizontal axis are ordered by the vessels&#x00027; VIDs. The red lines are the predicted labels and the blue lines are the true labels. Most points of vessel no. 7 are merged with vessel no. 5 and the remaining points are split into another cluster independent of the other vessels. Vessel no. 15 is merged with vessels no. 5 and no. 6, and contributes 2 jumps.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1124091-g0002.tif"/>
</fig>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>The clustering results of data set 2 and data set 3.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1124091-g0003.tif"/>
</fig>
<p>To evaluate the robustness of CBTR, we conducted experiments in which points were removed from the data sets so that the trajectories become harder to track. Specifically, we built validation sets by two methods: method 1 removes every fifth point (i.e., the fifth, tenth, etc.), and method 2 removes every second point (i.e., the second, fourth, etc.). Thus we take out 20% and 50% of the points, respectively, in each validation set and apply CBTR to predict the trajectories. For the downsampled AIS datasets 1, 2, and 3 using method 1, the correct rates of the estimated neighbors are 0.9977, 0.9977, and 0.9966, respectively. For the downsampled AIS datasets 1, 2, and 3 using method 2, the correct rates of the estimated neighbors are 0.9947, 0.9943, and 0.9913, respectively. As anticipated, the more points are removed, the lower the correct rates of the estimated neighbors are. However, CBTR still performs very well even when a large fraction of the points is removed. Furthermore, we remark that there is a trade-off between reducing the number of jumps and increasing the number of merges. If the upper bound for the time interval in Step 1 is larger than 1,000, more candidates are available for selecting the BPNP, which may lead to fewer jumps but more merges.</p>
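<p>The two downsampling schemes can be expressed with one small Python helper. The function name is hypothetical, and the sketch operates on whatever ordered sequence of points the caller supplies; whether the removal is applied per trajectory or to the whole data set is left open here.</p>
<preformat><![CDATA[
```python
def drop_every_kth(points, k):
    """Remove the k-th, 2k-th, ... elements (1-based positions) of an ordered sequence."""
    if k < 2:
        raise ValueError("k must be at least 2, otherwise every point is removed")
    return [p for pos, p in enumerate(points, start=1) if pos % k != 0]
```
]]></preformat>
<p>Calling the helper with <italic>k</italic> &#x0003D; 5 corresponds to method 1 and removes 20% of the points, while <italic>k</italic> &#x0003D; 2 corresponds to method 2 and removes 50%.</p>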
</sec>
<sec>
<title>3.3. Discussion on the performance of CBTR</title>
<p>The trajectories of all vessels in data set 1 predicted by CBTR are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. The left-hand boundary of this picture shows that the data set contains some incomplete trajectories, which are impossible to cluster correctly. A zoomed-in picture of this boundary phenomenon is given in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>The predicted trajectories of all vessels in data set 1 by using CBTR. Points are colored and lined according to clusters and are numbered by the actual VIDs. The two red boxed regions are enlarged in the following figures. Most of the trajectories are clustered correctly. The only visible errors in this picture are the merge of vessels no. 1 and no. 19 and the merge of vessels no. 13 and no. 20 (in the middle left). Details of the two red boxes are shown in <xref ref-type="fig" rid="F5">Figures 5</xref>, <xref ref-type="fig" rid="F6">6</xref>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1124091-g0004.tif"/>
</fig>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Points are colored and lined according to clusters and are numbered by the actual VIDs. The merge of vessel no. 19 with vessel no. 1 is due to the boundary of the dataset. Vessel no. 13 is connected to no. 20, which is outside the plot, due to the same boundary effect.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1124091-g0005.tif"/>
</fig>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>The predicted trajectories of vessels no. 5&#x02013;8 and no. 14&#x02013;16. Points are colored and lined according to clusters and are numbered by the actual VIDs. Points of vessel no. 14 are merged with others due to their problematic location. On the other hand, the endpoint of vessel no. 5 is incorrectly connected to a point of vessel no. 7. Vessel no. 8, which is steady, is perfectly clustered.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1124091-g0006.tif"/>
</fig>
<p><xref ref-type="fig" rid="F6">Figure 6</xref> shows another kind of mistake made by CBTR, which happens at the endpoints of some trajectories. To be precise, when a vessel returns and docks at a pier, it turns off its AIS signal transmitter, and the last reported position should be the endpoint of the trajectory. Sometimes, however, CBTR finds a false next point for this endpoint and continues the trajectory. For example, in <xref ref-type="fig" rid="F6">Figure 6</xref>, vessel no. 7 (colored purple) left toward the west, came back, and docked to the east of vessel no. 5. At that moment, vessel no. 5 was reporting its last location before turning off its signal transmitter. CBTR found a point of vessel no. 7 to be a possible next point of the last point of vessel no. 5, so it made a wrong connection from the circled point to the squared point. This is called a terminal-type mistake and is counted as a merge.</p>
<p>These terminal-type mistakes only happen when two vessels are anchored close to each other. We can prevent them by using a more restrictive connecting criterion, but this would break some trajectories of moving vessels, because AIS points in moving trajectories are much sparser than those of steady vessels moored at the piers. For such moored vessels, the speed and angle change randomly under wave drift forces, so the variances of the white noises in model (3) may be larger than the signal (speed). These terminal-type mistakes are not that serious because AIS data are mainly used to recognize moving vessels. Except for the boundary phenomenon and the terminal-type mistakes, CBTR performs perfectly in generic situations and is a reliable method for predicting trajectories. <xref ref-type="table" rid="T5">Table 5</xref> shows the performance of CBTR for vessels with different speeds. One can see that slow vessels are the most challenging ones for CBTR.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Performance of CBTR for vessels of different speed types.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Types</bold></th>
<th valign="top" align="center"><bold>Set 1</bold></th>
<th valign="top" align="center"><bold>Set 2</bold></th>
<th valign="top" align="center"><bold>Set 3</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Overall</td>
<td valign="top" align="center">0.9986</td>
<td valign="top" align="center">0.9985</td>
<td valign="top" align="center">0.9974</td>
</tr> <tr>
<td valign="top" align="left">High speed</td>
<td valign="top" align="center">0.9987</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.9994</td>
</tr> <tr>
<td valign="top" align="left">Fast</td>
<td valign="top" align="center">0.9994</td>
<td valign="top" align="center">0.9996</td>
<td valign="top" align="center">0.9992</td>
</tr> <tr>
<td valign="top" align="left">Slow</td>
<td valign="top" align="center">0.9990</td>
<td valign="top" align="center">0.9882</td>
<td valign="top" align="center">0.9763</td>
</tr> <tr>
<td valign="top" align="left">Steady</td>
<td valign="top" align="center">0.9983</td>
<td valign="top" align="center">0.9982</td>
<td valign="top" align="center">0.9972</td>
</tr></tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>3.4. Discussion on the performance of the LSTM path prediction</title>
<p>Long Short-Term Memory (LSTM) [<xref ref-type="bibr" rid="B27">27</xref>] is a type of Recurrent Neural Network (RNN). Its feedback loops allow time-dependent problems to be solved: previous outputs can be used as inputs to help model the current output. More generally, LSTMs can handle problems whose data have an inherent order. They can also model sequences of different lengths, which is ideal here because vessel paths often have different numbers of points [<xref ref-type="bibr" rid="B32">32</xref>].</p>
<p>LSTMs have been used for predicting vessel trajectories with AIS data [<xref ref-type="bibr" rid="B33">33</xref>&#x02013;<xref ref-type="bibr" rid="B35">35</xref>] as they can naturally be adapted to multi-target learning and are capable of learning both simple and complex patterns. Here we treat the timestamp, latitude, longitude, speed, and direction at time <italic>t</italic> as response variables, whereas the predictor variables (i.e., the inputs to the LSTM) are the same five quantities at times <italic>t</italic>&#x02212;1, <italic>t</italic>&#x02212;2, &#x022EF;&#x02009;, <italic>t</italic>&#x02212;<italic>k</italic>. That is, we train an LSTM on lagged versions of these variables in order to predict their values at one time point in the future. The goal is to predict all characteristics of a vessel automatically using previous information. The architecture and tuning were determined by trial and error using a random 20% validation sample.</p>
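<p>The lagged input/target construction described above can be illustrated with a minimal sketch. It is not the code used in this study: the function name <monospace>make_lagged_samples</monospace>, the tuple layout, and the toy track are assumptions made purely for illustration.</p>
<preformat><![CDATA[
```python
from typing import List, Sequence, Tuple

# One AIS record: (timestamp, latitude, longitude, speed, direction).
# This layout is an assumption for illustration only.
Point = Tuple[float, float, float, float, float]


def make_lagged_samples(
    track: Sequence[Point], k: int = 1
) -> Tuple[List[List[Point]], List[Point]]:
    """Pair each point at time t with its k predecessors (oldest first)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    inputs: List[List[Point]] = []
    targets: List[Point] = []
    for t in range(k, len(track)):
        inputs.append(list(track[t - k:t]))  # predictors: times t-k .. t-1
        targets.append(track[t])             # response: time t
    return inputs, targets


# Hypothetical toy track of five records from a single vessel.
track = [(float(i), 10.0 + 0.1 * i, 20.0 + 0.2 * i, 5.0, 90.0) for i in range(5)]
X, y = make_lagged_samples(track, k=2)
print(len(X), len(y))
```
]]></preformat>
<p>Each element of <monospace>X</monospace> holds the <italic>k</italic> preceding records and the matching element of <monospace>y</monospace> holds the record to be predicted; with <italic>k</italic> &#x0003D; 1 every input reduces to the single current AIS point.</p>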
<p>The characteristics of the LSTM are the following: an input dimension of 5 (i.e., timestamp, latitude, longitude, speed, and direction, lagged by <italic>k</italic> &#x0003D; 1 time unit), 1 hidden layer with 250 hidden units using the Rectified Linear Unit (ReLU) activation function, max(0, <italic>x</italic>), and 5 output nodes (i.e., timestamp, latitude, longitude, speed, and direction at time <italic>t</italic>). Additional values for the number of lags were tried, but the performance was essentially unchanged; different activation functions were also tried and tended to produce inferior results. The software used was the Keras library in Python [<xref ref-type="bibr" rid="B36">36</xref>].</p>
<p>The results from the LSTM using all five variables as outputs seem to indicate that this approach is unable to distinguish the different vessel trajectories. Several reasons may contribute, including the initial value and the training set of the LSTM [<xref ref-type="bibr" rid="B33">33</xref>], the changes of course and speed [<xref ref-type="bibr" rid="B34">34</xref>] in the given prediction time range, and the normalization method <inline-formula><mml:math id="M37"><mml:mfrac><mml:mrow><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:mo class="qopname">min</mml:mo></mml:mrow><mml:mrow><mml:mo class="qopname">max</mml:mo><mml:mo>-</mml:mo><mml:mo class="qopname">min</mml:mo></mml:mrow></mml:mfrac></mml:math></inline-formula> [<xref ref-type="bibr" rid="B35">35</xref>], which may over-compress trajectory data since some trajectories have large ranges but others do not.</p>
<p>The performance of the LSTM next-point prediction method depends fundamentally on the labeled historical trajectories used to train the model to predict the properties of the node at the next time point [<xref ref-type="bibr" rid="B32">32</xref>, <xref ref-type="bibr" rid="B37">37</xref>]. That is, a training set with labeled trajectories is needed to accurately predict the timestamp, latitude, longitude, speed, and direction at some future point in time. However, to make a fair comparison, only the current AIS point is used in training an LSTM model for predicting the next point. This prevents the recurrent neurons from sufficiently learning the latent features in the AIS datasets and leads to inaccurate predictions [<xref ref-type="bibr" rid="B38">38</xref>, <xref ref-type="bibr" rid="B39">39</xref>]. LSTM models are also known to require a large amount of data in order to be effective, so the relatively small size of the individual AIS training datasets is a further contributing factor to the LSTM&#x00027;s performance.</p>
<p>An inspection of the LSTM predictions and the resulting nearest neighbor search indicates that most of the errors are related primarily to two factors. First, some vessels rapidly change their speed and direction while other vessels that were previously similar to them do not, which results in misclassification, for example, among vessels no. 5&#x02013;8. Second, the AIS points predicted by the LSTM may have large variations [<xref ref-type="bibr" rid="B37">37</xref>]; in combination with a larger number of candidates within each time window (i.e., the time window in the nearest neighbor search), mistakes accumulate.</p>
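<p>The over-compression effect of min-max normalization can be seen in a small numerical sketch. The latitude values below are hypothetical, and we assume, for illustration only, that the minimum and maximum are taken over the pooled data of all trajectories rather than per trajectory.</p>
<preformat><![CDATA[
```python
from typing import List, Sequence


def min_max(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Apply (x - min) / (max - min) with a given min (lo) and max (hi)."""
    if hi == lo:
        raise ValueError("max and min must differ")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical latitudes: one long-range track and one nearly stationary track.
wide = [10.0, 12.0, 14.0]
narrow = [12.000, 12.001, 12.002]

# Assumption: the scaling constants come from the pooled data.
pooled = wide + narrow
lo, hi = min(pooled), max(pooled)
narrow_global = min_max(narrow, lo, hi)

# For comparison: scaling the narrow track by its own range.
narrow_local = min_max(narrow, min(narrow), max(narrow))

span_global = max(narrow_global) - min(narrow_global)
span_local = max(narrow_local) - min(narrow_local)
print(span_global, span_local)
```
]]></preformat>
<p>Under pooled scaling, the nearly stationary track occupies only a tiny fraction of the unit interval, so its internal structure is hard to resolve, whereas scaling by its own range spreads it over the whole interval.</p>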
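<p>The second error source can be illustrated with a simplified nearest neighbor search restricted to a time window. This is only a sketch under stated assumptions, not the implementation used here: the function name <monospace>nearest_candidate</monospace>, the planar Euclidean distance, and all numbers are hypothetical.</p>
<preformat><![CDATA[
```python
import math
from typing import Optional, Sequence, Tuple

# (timestamp, x, y) in planar coordinates; an assumption for illustration.
Obs = Tuple[float, float, float]


def nearest_candidate(
    predicted: Obs, candidates: Sequence[Obs], window: float
) -> Optional[int]:
    """Index of the spatially closest candidate whose timestamp lies within
    `window` of the predicted timestamp, or None if there is no such point."""
    best_index: Optional[int] = None
    best_dist = math.inf
    for i, (t, x, y) in enumerate(candidates):
        if abs(t - predicted[0]) > window:
            continue  # outside the time window
        dist = math.hypot(x - predicted[1], y - predicted[2])
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


# Two nearby candidates inside the window and one far away in time.
candidates = [(101.0, 0.5, 0.5), (99.0, 0.4, 0.6), (300.0, 0.0, 0.0)]
exact = nearest_candidate((100.0, 0.0, 0.0), candidates, window=10.0)
noisy = nearest_candidate((100.0, 0.0, 0.3), candidates, window=10.0)
print(exact, noisy)
```
]]></preformat>
<p>In this toy example a small shift in the predicted position changes which candidate is selected, which is how prediction variance combined with several candidates in the same window can lead to wrong connections.</p>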
</sec>
</sec>
<sec sec-type="conclusions" id="s4">
<title>4. Conclusions</title>
<p>The proposed CBTR method successfully clusters AIS points and tracks trajectories without knowing the true labels of the AIS points. Step 2 of CBTR is the essence of our method: it integrates the forward and backward estimated positions into the measure of the difference between two adjacent points. This step evaluates dynamically how good the fitted path is, instead of relying on static point information such as the mutual distances between points. Thus, the CBTR algorithm is able to distinguish intersecting trajectories. The second feature of Step 2 is the definition of a suitable parameter &#x003C4; to exchange time and space scales. Therefore, CBTR is applicable to various kinds of moving-point data lacking labels, and its spatiotemporal features can be used with other methods [<xref ref-type="bibr" rid="B40">40</xref>] to select a safe maneuver in a crossing scenario with two target ships.</p>
</sec>
<sec sec-type="data-availability" id="s5">
<title>Data availability statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: <ext-link ext-link-type="uri" xlink:href="https://gitlab.com/algorithms-for-threat-detection/2019/atd2019">https://gitlab.com/algorithms-for-threat-detection/2019/atd2019</ext-link>.</p>
</sec>
<sec sec-type="author-contributions" id="s6">
<title>Author contributions</title>
<p>H-HH and C-WC reviewed literature and designed the proposed methods. C-WC wrote and ran the Matlab code for the proposed CBTR method and edited the proposed methodology and the analysis results. H-HH drafted the manuscript. All authors proofread the manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec sec-type="funding-information" id="s7">
<title>Funding</title>
<p>This research was supported in part by the National Science Foundation grant DMS-1924792 and the MOST Young Scholar Fellowship Program, grant no. NSTC 112-2636-M-110-007.</p>
</sec>
<ack><p>C-WC thanks the National Center for Theoretical Sciences in Taiwan for its support.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s8">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="s9">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fams.2023.1124091/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fams.2023.1124091/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Presentation_1.zip" id="SM1" mimetype="application/zip" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>

<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mankabady</surname> <given-names>S</given-names></name></person-group>. <source>The International Maritime Organization, Volume 1: International Shipping Rules</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name> (<year>1986</year>).</citation>
</ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Natale</surname> <given-names>F</given-names></name> <name><surname>Gibin</surname> <given-names>M</given-names></name> <name><surname>Alessandrini</surname> <given-names>A</given-names></name> <name><surname>Vespe</surname> <given-names>M</given-names></name> <name><surname>Paulrud</surname> <given-names>A</given-names></name></person-group>. <article-title>Mapping fishing effort through AIS data</article-title>. <source>PLoS ONE.</source> (<year>2015</year>) <volume>10</volume>:<fpage>e0130746</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0130746</pub-id><pub-id pub-id-type="pmid">26098430</pub-id></citation></ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Mercer</surname> <given-names>D</given-names></name></person-group>. <source>Algorithms for Threat Detection 2019 Challenge AIS Data</source>. (<year>2019</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://gitlab.com/algorithms-for-threat-detection/2019/atd2019">https://gitlab.com/algorithms-for-threat-detection/2019/atd2019</ext-link> (accessed October 28, 2020).</citation>
</ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bautista-S&#x000E1;nchez</surname> <given-names>R</given-names></name> <name><surname>Barbosa-Santillan</surname> <given-names>LI</given-names></name> <name><surname>S&#x000E1;nchez-Escobar</surname> <given-names>JJ</given-names></name></person-group>. <article-title>Method for select best AIS data in prediction vessel movements and route estimation</article-title>. <source>Appl Sci</source>. (<year>2021</year>) <volume>11</volume>:<fpage>2429</fpage>. <pub-id pub-id-type="doi">10.3390/app11052429</pub-id></citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mehrholz</surname> <given-names>D</given-names></name> <name><surname>Leushacke</surname> <given-names>L</given-names></name> <name><surname>Flury</surname> <given-names>W</given-names></name> <name><surname>Jehn</surname> <given-names>R</given-names></name> <name><surname>Klinkrad</surname> <given-names>H</given-names></name> <name><surname>Landgraf</surname> <given-names>M</given-names></name></person-group>. <article-title>Detecting, tracking and imaging space debris</article-title>. <source>ESA Bull</source>. (<year>2002</year>) <fpage>128</fpage>&#x02013;<lpage>34</lpage>.</citation>
</ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Young</surname> <given-names>BL</given-names></name></person-group>. <source>Predicting Vessel Trajectories From AIS Data Using R</source>. <publisher-loc>Monterey, CA</publisher-loc>: <publisher-name>Naval Postgraduate School Monterey</publisher-name> (<year>2017</year>).</citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Best</surname> <given-names>RA</given-names></name> <name><surname>Norton</surname> <given-names>JP</given-names></name></person-group>. <article-title>A new model and efficient tracker for a target with curvilinear motion</article-title>. <source>IEEE Trans Aerospace Electron Syst</source>. (<year>1997</year>) <volume>33</volume>:<fpage>1030</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1109/TAES.1997.599329</pub-id></citation>
</ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perera</surname> <given-names>LP</given-names></name> <name><surname>Oliveira</surname> <given-names>P</given-names></name> <name><surname>Soares</surname> <given-names>CG</given-names></name></person-group>. <article-title>Maritime traffic monitoring based on vessel detection, tracking, state estimation, and trajectory prediction</article-title>. <source>IEEE Trans Intell Transp Syst</source>. (<year>2012</year>) <volume>13</volume>:<fpage>1188</fpage>&#x02013;<lpage>200</lpage>. <pub-id pub-id-type="doi">10.1109/TITS.2012.2187282</pub-id></citation>
</ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schubert</surname> <given-names>R</given-names></name> <name><surname>Richter</surname> <given-names>E</given-names></name> <name><surname>Wanielik</surname> <given-names>G</given-names></name></person-group>. <article-title>Comparison and evaluation of advanced motion models for vehicle tracking</article-title>. In: <source>2008 11th International Conference on Information Fusion.</source> <publisher-loc>Cologne</publisher-loc> (<year>2008</year>). p. <fpage>1</fpage>&#x02013;<lpage>6</lpage>.</citation>
</ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nguyen</surname> <given-names>D</given-names></name> <name><surname>Vadaine</surname> <given-names>R</given-names></name> <name><surname>Hajduch</surname> <given-names>G</given-names></name> <name><surname>Garello</surname> <given-names>R</given-names></name> <name><surname>Fablet</surname> <given-names>R</given-names></name></person-group>. <article-title>A multi-task deep learning architecture for maritime surveillance using AIS data streams</article-title>. In: <source>2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)</source>. <publisher-loc>Turin</publisher-loc> (<year>2018</year>). p. <fpage>331</fpage>&#x02013;<lpage>40</lpage>. <pub-id pub-id-type="doi">10.1109/DSAA.2018.00044</pub-id></citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Millefiori</surname> <given-names>LM</given-names></name> <name><surname>Braca</surname> <given-names>P</given-names></name> <name><surname>Bryan</surname> <given-names>K</given-names></name> <name><surname>Willett</surname> <given-names>PK</given-names></name></person-group>. <article-title>Modeling vessel kinematics using a stochastic mean-reverting process for long-term prediction</article-title>. <source>IEEE Trans Aerospace Electron Syst</source>. (<year>2016</year>) <volume>52</volume>:<fpage>2313</fpage>&#x02013;<lpage>30</lpage>. <pub-id pub-id-type="doi">10.1109/TAES.2016.150596</pub-id></citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pallotta</surname> <given-names>G</given-names></name> <name><surname>Horn</surname> <given-names>S</given-names></name> <name><surname>Braca</surname> <given-names>P</given-names></name> <name><surname>Bryan</surname> <given-names>KB</given-names></name></person-group>. <article-title>Context-enhanced vessel prediction based on Ornstein-Uhlenbeck processes using historical AIS traffic patterns: real-world experimental results</article-title>. In: <source>17th International Conference on Information Fusion</source>. Vol. 4. Salamanca (<year>2014</year>). p. 213.</citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mazzarella</surname> <given-names>F</given-names></name> <name><surname>Arguedas</surname> <given-names>VF</given-names></name> <name><surname>Vespe</surname> <given-names>M</given-names></name></person-group>. <article-title>Knowledge-based vessel position prediction using historical AIS data</article-title>. In: <source>2015 Sensor Data Fusion: Trends, Solutions, Applications (SDF)</source>. <publisher-loc>Bonn</publisher-loc> (<year>2015</year>). p. <fpage>1</fpage>&#x02013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1109/SDF.2015.7347707</pub-id></citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hexeberg</surname> <given-names>S</given-names></name> <name><surname>Flaten</surname> <given-names>AL</given-names></name> <name><surname>Eriksen</surname> <given-names>BOH</given-names></name> <name><surname>Brekke</surname> <given-names>EF</given-names></name></person-group>. <article-title>AIS-based vessel trajectory prediction</article-title>. In: <source>2017 20th International Conference on Information Fusion (Fusion)</source>. Xi&#x00027;an (<year>2017</year>). p. <fpage>1</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.23919/ICIF.2017.8009762</pub-id></citation>
</ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coscia</surname> <given-names>P</given-names></name> <name><surname>Braca</surname> <given-names>P</given-names></name> <name><surname>Millefiori</surname> <given-names>LM</given-names></name> <name><surname>Palmieri</surname> <given-names>FAN</given-names></name> <name><surname>Willett</surname> <given-names>PK</given-names></name></person-group>. <article-title>Multiple Ornstein-Uhlenbeck processes for maritime traffic graph representation</article-title>. <source>IEEE Trans Aerospace Electron Syst</source>. (<year>2018</year>) <volume>54</volume>:<fpage>2158</fpage>&#x02013;<lpage>70</lpage>. <pub-id pub-id-type="doi">10.1109/TAES.2018.2808098</pub-id></citation>
</ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>JG</given-names></name> <name><surname>Han</surname> <given-names>J</given-names></name> <name><surname>Whang</surname> <given-names>KY</given-names></name></person-group>. <article-title>Trajectory clustering: a partition-and-group framework</article-title>. In: <source>Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. SIGMOD&#x00027;07</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>ACM</publisher-name> (<year>2007</year>). p. <fpage>593</fpage>&#x02013;<lpage>604</lpage>. <pub-id pub-id-type="doi">10.1145/1247480.1247546</pub-id></citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pallotta</surname> <given-names>G</given-names></name> <name><surname>Vespe</surname> <given-names>M</given-names></name> <name><surname>Bryan</surname> <given-names>K</given-names></name></person-group>. <article-title>Vessel pattern knowledge discovery from AIS data: a framework for anomaly detection and route prediction</article-title>. <source>Entropy</source>. (<year>2013</year>) <volume>15</volume>:<fpage>2218</fpage>&#x02013;<lpage>45</lpage>. <pub-id pub-id-type="doi">10.3390/e15062218</pub-id></citation>
</ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gower</surname> <given-names>JC</given-names></name> <name><surname>Ross</surname> <given-names>GJ</given-names></name></person-group>. <article-title>Minimum spanning trees and single linkage cluster analysis</article-title>. <source>J R Stat Soc Ser C</source>. (<year>1969</year>) <volume>18</volume>:<fpage>54</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.2307/2346439</pub-id></citation>
</ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>HH</given-names></name> <name><surname>Yu</surname> <given-names>C</given-names></name> <name><surname>Zheng</surname> <given-names>H</given-names></name> <name><surname>Hernandez</surname> <given-names>T</given-names></name> <name><surname>Yau</surname> <given-names>SC</given-names></name> <name><surname>He</surname> <given-names>RL</given-names></name> <etal/></person-group>. <article-title>Global comparison of multiple-segmented viruses in 12-dimensional genome space</article-title>. <source>Mol Phylogenet Evol</source>. (<year>2014</year>) <volume>81</volume>:<fpage>29</fpage>&#x02013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1016/j.ympev.2014.08.003</pub-id><pub-id pub-id-type="pmid">25172357</pub-id></citation></ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scott</surname> <given-names>A</given-names></name> <name><surname>Wu</surname> <given-names>CF</given-names></name></person-group>. <article-title>On the asymptotic distribution of ratio and regression estimators</article-title>. <source>J Am Stat Assoc</source>. (<year>1981</year>) <volume>76</volume>:<fpage>98</fpage>&#x02013;<lpage>102</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1981.10477612</pub-id></citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Silber</surname> <given-names>GK</given-names></name> <name><surname>Adams</surname> <given-names>JD</given-names></name> <name><surname>Fonnesbeck</surname> <given-names>CJ</given-names></name></person-group>. <article-title>Compliance with vessel speed restrictions to protect North Atlantic right whales</article-title>. <source>PeerJ</source>. (<year>2014</year>) <volume>2</volume>:<fpage>e399</fpage>. <pub-id pub-id-type="doi">10.7717/peerj.399</pub-id><pub-id pub-id-type="pmid">24949229</pub-id></citation></ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Redoutey</surname> <given-names>M</given-names></name> <name><surname>Scotti</surname> <given-names>E</given-names></name> <name><surname>Jensen</surname> <given-names>C</given-names></name> <name><surname>Ray</surname> <given-names>C</given-names></name> <name><surname>Claramunt</surname> <given-names>C</given-names></name></person-group>. <article-title>Efficient vessel tracking with accuracy guarantees</article-title>. In: <source>International Symposium on Web and Wireless Geographical Information Systems</source>. <publisher-loc>London</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2008</year>). p. <fpage>140</fpage>&#x02013;<lpage>51</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-540-89903-7_13</pub-id></citation>
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dryver</surname> <given-names>AL</given-names></name> <name><surname>Chao</surname> <given-names>CT</given-names></name></person-group>. <article-title>Ratio estimators in adaptive cluster sampling</article-title>. <source>Environmetrics</source>. (<year>2007</year>) <volume>18</volume>:<fpage>607</fpage>&#x02013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1002/env.838</pub-id></citation>
</ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>J</given-names></name> <name><surname>Shen</surname> <given-names>X</given-names></name> <name><surname>Liu</surname> <given-names>Y</given-names></name></person-group>. <article-title>Probability estimation for large-margin classifiers</article-title>. <source>Biometrika</source>. (<year>2008</year>) <volume>95</volume>:<fpage>149</fpage>&#x02013;<lpage>67</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/asm077</pub-id></citation>
</ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y</given-names></name> <name><surname>Zhang</surname> <given-names>HH</given-names></name> <name><surname>Liu</surname> <given-names>Y</given-names></name></person-group>. <article-title>Robust model-free multiclass probability estimation</article-title>. <source>J Am Stat Assoc</source>. (<year>2010</year>) <volume>105</volume>:<fpage>424</fpage>&#x02013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1198/jasa.2010.tm09107</pub-id><pub-id pub-id-type="pmid">21113386</pub-id></citation></ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stone</surname> <given-names>CJ</given-names></name></person-group>. <article-title>Consistent nonparametric regression</article-title>. <source>Ann Stat</source>. (<year>1977</year>) <volume>5</volume>:<fpage>595</fpage>&#x02013;<lpage>620</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1176343886</pub-id></citation>
</ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hochreiter</surname> <given-names>S</given-names></name> <name><surname>Schmidhuber</surname> <given-names>J</given-names></name></person-group>. <article-title>Long short-term memory</article-title>. <source>Neural Comput</source>. (<year>1997</year>) <volume>9</volume>:<fpage>1735</fpage>&#x02013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.1162/neco.1997.9.8.1735</pub-id><pub-id pub-id-type="pmid">9377276</pub-id></citation></ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ding</surname> <given-names>M</given-names></name> <name><surname>Su</surname> <given-names>W</given-names></name> <name><surname>Liu</surname> <given-names>Y</given-names></name> <name><surname>Zhang</surname> <given-names>J</given-names></name> <name><surname>Li</surname> <given-names>J</given-names></name> <name><surname>Wu</surname> <given-names>J</given-names></name></person-group>. <article-title>A novel approach on vessel trajectory prediction based on variational LSTM</article-title>. In: <source>2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA)</source>. <publisher-loc>Dalian</publisher-loc> (<year>2020</year>). p. <fpage>206</fpage>&#x02013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1109/ICAICA50127.2020.9182537</pub-id></citation>
</ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jin</surname> <given-names>J</given-names></name> <name><surname>Zhou</surname> <given-names>W</given-names></name> <name><surname>Jiang</surname> <given-names>B</given-names></name></person-group>. <article-title>Maritime target trajectory prediction model based on the RNN network</article-title>. In: <source>Artificial Intelligence in China</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2020</year>). p. <fpage>334</fpage>&#x02013;<lpage>42</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-15-0187-6_39</pub-id></citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dempster</surname> <given-names>AP</given-names></name> <name><surname>Laird</surname> <given-names>NM</given-names></name> <name><surname>Rubin</surname> <given-names>DB</given-names></name></person-group>. <article-title>Maximum likelihood from incomplete data via the EM algorithm</article-title>. <source>J R Stat Soc Ser B</source>. (<year>1977</year>) <volume>39</volume>:<fpage>1</fpage>&#x02013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1111/j.2517-6161.1977.tb01600.x</pub-id></citation>
</ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>DB</given-names></name> <name><surname>Thayer</surname> <given-names>DT</given-names></name></person-group>. <article-title>EM algorithms for ML factor analysis</article-title>. <source>Psychometrika</source>. (<year>1982</year>) <volume>47</volume>:<fpage>69</fpage>&#x02013;<lpage>76</lpage>. <pub-id pub-id-type="doi">10.1007/BF02293851</pub-id></citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Capobianco</surname> <given-names>S</given-names></name> <name><surname>Millefiori</surname> <given-names>LM</given-names></name> <name><surname>Forti</surname> <given-names>N</given-names></name> <name><surname>Braca</surname> <given-names>P</given-names></name> <name><surname>Willett</surname> <given-names>P</given-names></name></person-group>. <article-title>Deep learning methods for vessel trajectory prediction based on recurrent neural networks</article-title>. <source>IEEE Trans Aerospace Electron Syst</source>. (<year>2021</year>) <volume>57</volume>:<fpage>4329</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1109/TAES.2021.3096873</pub-id></citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>H</given-names></name> <name><surname>Yin</surname> <given-names>Y</given-names></name> <name><surname>Shen</surname> <given-names>H</given-names></name></person-group>. <article-title>A model for vessel trajectory prediction based on long short-term memory neural network</article-title>. <source>J Mar Eng Technol</source>. (<year>2019</year>) <volume>21</volume>:<fpage>136</fpage>&#x02013;<lpage>45</lpage>. <pub-id pub-id-type="doi">10.1080/20464177.2019.1665258</pub-id></citation>
</ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Forti</surname> <given-names>N</given-names></name> <name><surname>Millefiori</surname> <given-names>LM</given-names></name> <name><surname>Braca</surname> <given-names>P</given-names></name> <name><surname>Willett</surname> <given-names>P</given-names></name></person-group>. <article-title>Prediction of vessel trajectories from AIS data via sequence-to-sequence recurrent neural networks</article-title>. In: <source>ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>. <publisher-loc>Barcelona</publisher-loc> (<year>2020</year>). p. <fpage>8936</fpage>&#x02013;<lpage>40</lpage>. <pub-id pub-id-type="doi">10.1109/ICASSP40776.2020.9054421</pub-id></citation>
</ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Z</given-names></name> <name><surname>Ni</surname> <given-names>G</given-names></name> <name><surname>Xu</surname> <given-names>Y</given-names></name></person-group>. <article-title>Ship trajectory prediction based on LSTM neural network</article-title>. In: <source>2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC)</source>. <publisher-loc>Chongqing</publisher-loc> (<year>2020</year>). p. <fpage>1356</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1109/ITOEC49072.2020.9141702</pub-id></citation>
</ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Chollet</surname> <given-names>F</given-names></name></person-group>. <source>Deep Learning with Python</source>. <publisher-name>Simon and Schuster</publisher-name> (<year>2021</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://github.com/keras-team/keras">https://github.com/keras-team/keras</ext-link></citation>
</ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>DW</given-names></name> <name><surname>Zhu</surname> <given-names>YS</given-names></name> <name><surname>Zhang</surname> <given-names>JF</given-names></name> <name><surname>He</surname> <given-names>YK</given-names></name> <name><surname>Yan</surname> <given-names>K</given-names></name> <name><surname>Yan</surname> <given-names>BR</given-names></name></person-group>. <article-title>A novel MP-LSTM method for ship trajectory prediction based on AIS data</article-title>. <source>Ocean Eng</source>. (<year>2021</year>) <volume>228</volume>:<fpage>108956</fpage>. <pub-id pub-id-type="doi">10.1016/j.oceaneng.2021.108956</pub-id></citation>
</ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sagheer</surname> <given-names>A</given-names></name> <name><surname>Kotb</surname> <given-names>M</given-names></name></person-group>. <article-title>Unsupervised pre-training of a deep LSTM-based stacked autoencoder for multivariate time series forecasting problems</article-title>. <source>Sci Rep</source>. (<year>2019</year>) <volume>9</volume>:<fpage>1</fpage>&#x02013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-019-55320-6</pub-id><pub-id pub-id-type="pmid">31836728</pub-id></citation></ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Suo</surname> <given-names>Y</given-names></name> <name><surname>Chen</surname> <given-names>W</given-names></name> <name><surname>Claramunt</surname> <given-names>C</given-names></name> <name><surname>Yang</surname> <given-names>S</given-names></name></person-group>. <article-title>A ship trajectory prediction framework based on a recurrent neural network</article-title>. <source>Sensors</source>. (<year>2020</year>) <volume>20</volume>:<fpage>5133</fpage>. <pub-id pub-id-type="doi">10.3390/s20185133</pub-id><pub-id pub-id-type="pmid">32916845</pub-id></citation></ref>
<ref id="B40">
<label>40.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sz&#x00142;apczy&#x00144;ski</surname> <given-names>R</given-names></name> <name><surname>Sz&#x00142;apczy&#x00144;ska</surname> <given-names>J</given-names></name></person-group>. <article-title>Heuristic method of safe manoeuvre selection based on collision threat parameters areas</article-title>. <source>TransNav</source>. (<year>2017</year>) <volume>11</volume>:<fpage>591</fpage>&#x02013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.12716/1001.11.04.03</pub-id></citation>
</ref>
</ref-list>
</back>
</article>