Predicting Motion Patterns Using Optimal Paths

The ability to navigate safely and efficiently through a given landscape is relevant for any intelligent moving object. Examples range from robotic science and traffic analysis, to the behavior within an ecosystem. Many objects tend to move in patterns depending on their nature. By establishing models of patterns of motion one may estimate the future motion within an area. We propose here a method for detecting regular patterns of motion by modeling the environment as an energy landscape, and locating optimal paths through it. As an example, we use maritime position Automatic Identification System (AIS) data as input to work out optimal routes between different start and end points when these are not located along the standard shipping lanes. These initial tests show that the method has potential for analyzing and determining regular patterns of motion.


INTRODUCTION
Imagine a large town square. The square has fixed structures such as fountains etc. that block direct pathways across it. It is a busy place where many pass across it in all directions. Cameras have been set up that record the motion of people across the town square. The question we will address in this paper is the following: Based on the recordings, is it possible to predict the motion of a single person from some start point A to some end point B across the town square, even when the points A and B are not located along the typical paths that people use across the square. This question has since long been posed in different contexts [1]. Examples range from the behavior of ecosystems to robotic navigation and traffic systems. A predator needs to account for the future motion of its prey in order to catch it, just as a ship needs to consider the future positions of other ships to avoid collisions [2].
Moving objects are influenced by both the landscape in which they move, as well as other objects, moving or not, within the same area [3][4][5]. There are several different ways to approach motion prediction. The most straight forward approach is to predict the motion of each object in a system individually, by assigning to each object a position as a function of time [6]. However, for large systems, this method would produce a large number of coupled equations. Hence, this approach would be unproductive in this case. A better approach is to exploit the fact that objects tend to move in patterns [3,7]. Depending on their nature and surroundings, moving objects tend to move in regular patterns. By establishing a model of these motion-patterns in a given area, one may use the pattern itself when predicting future motion. This is the core idea of our approach.
When applying methods of pattern recognition to motion prediction, the process typically operates in two stages. The first stage is the actual pattern recognizing, which learns the regular patterns of motion using a set of training data. The next stage applies the learned pattern to predict the future motion. Further, this two-stage process may be grouped into two main groups of techniques; Grid-based techniques and cluster-based techniques [3].
The grid-based techniques are derived from the occupancy grid concept [8]. That is, the landscape is modeled as a grid and transition probabilities between the cells are calculated from the training data. The grid is then used directly for motion prediction. Grid-based techniques are frequently used in robot navigation systems [9][10][11].
Cluster-based techniques on the other hand, apply statistical decision tools in order to group similar trajectories into representative clusters. Several different clustering techniques exists, the Expectation-Maximization approach [12] is considered to be the state of the art [3]. Future motion of a moving object is then estimated as the representative cluster which the given route is most likely to belong to.
In this paper, we propose a dynamic grid-based technique for learning motion patterns by mapping it onto the optimal paths in a disordered landscape problem [13,14]. We describe this problem as follows. Imagine a plane and that x → is a point on this plane. There is a stochastic field e( x → ) associated with the plane. We choose a path P through the plane starting at point x → A and ending at point x → B . We integrate the field e( x → ) along the path P, The optimal path is found by the minimization This problem has produced a large body of work within the statistical physics community. It is also closely related to the optimal path problem which is central in a large number of applications and fields [15][16][17][18].
The central idea we present in this paper is to relate the function e( x → ) to the inverse of the density earlier paths raised to some power. We then identify the optimal path from x → A to x → B through this landscape. We apply this idea to vessel traffic, using marine automatic identification system coordinates. We transform the coordinates to a dimensionless area and introduce a grid over the area. We associate each grid point with the local density of AIS coordinates. We implement the optimal paths through the area using the iterative algorithm of Hansen and Kertész [19,21], but any other algorithms may be used, e.g., the Bellman-Ford or the Dijkstra algorithms [22][23][24].
We emphasize that we are not attempting here to present a fully implementable algorithm ready to be used on ships. Rather, this is a feasibility study testing whether the central ideas may work.
We note that optimal paths have been used earlier in connection with marine motion prediction [20]. However, the paths in this case are optimized with respect to length. This is a very different concept than what we present here.
We organize this paper as follows. Section 2 describes the method we propose. In Section 3, we implement the method for marine AIS data. We end by a brief summary and discussion.

ALGORITHM
We now describe how we transform the AIS coordinates, given in terms of continuous longitude and latitude, into grid points. We then go on to describe the concept of optimal paths in this context and the algorithm used to extract it. Lastly, we describe how we turn this into path prediction.

From Automatic Identification System Coordinates to Grid
The automatic tracking system AIS uses tranceivers to allow ships to view surrounding marine traffic and to be seen themselves. It provides, among other services, a record of the position as a function of time for the equipped vessels passing through the area. This includes most large vessels.
We define our area of interest as the rectangle defined by the corners given by the longitudinal and latitudinal coordinates long min , long max , lae min and lae max . A given ship at a given time is at position (long k , lae k ) where the subscript refers to the position record (i.e., which ship and at what time). We introduce dimensionless Cartesian coordinates to describe its position, x k ∈ {0, N − 1} and y k ∈ {0, N − 1}, where N is an integer, given by and when K is the number of position recordings within the grid cell. We now associate a weight with node (i, j), where α is an adjustable parameter controlling the magnitude of the fluctuations of e i,j : if α → 0 the fluctuations are smoothened out and disappear when α 0 as all nodes then are assigned the same weight. The parameter m is chosen so that there is a balance between the tendency for a path to follow the normal shipping lanes (where ρ i,j is large) and a path being as short as possible. The other value M ≫ m ensures that no paths crosses land.
The last step in setting up the system is to assign weights to the links between neighboring nodes. Let the node (i nn , j nn ) be one of the four nearest neighbors of node (i, j). Then, the link between them is given the weight e i,j;inn,jnn 1 2 e i,j + e inn,jnn .
We also allow for diagonal paths. The link between node (i, j) and its diagonal neighbors (i nn , j nn ) as where the factor 2 √ is introduced to take into account the additional length of the diagonal edges.

Optimal Path Construction
We define a path P between two nodes A at (i A , j A ) and B at (i B , j B ) as a continuous chain of neighboring links linking the two nodes. We associate a weight of the path in the same way as in Eq. 1, The optimal path is then  Frontiers in Physics | www.frontiersin.org June 2021 | Volume 9 | Article 656296 3

Hansen and Fromreide
Motion Patterns Using Optimal Paths We will in the following assume that both nodes A and B lie on the edges of the grid. In order to identify the optimal path, we use the algorithm of Hansen and Kertész [19]. It consists of two main steps; first an initialization and then an updating process. A variable e i,j is assigned to each node. For the nodes on the edges of the grid, the values e i,j stay fixed, while it is updated for the internal nodes iteratively. The iteration algorithm for the internal nodes is e i,j → e i,j min inn,jnn e i,j;inn,jnn + e inn,jnn .
After M iterations, the variable e i,j will contain the sum of the weights along the optimal path starting at node (i, j) of length M. The end point of the optimal path is not specified. Furthermore, the optimal path may curl up on itself, creating a tadpole configuration.
Consider now a node (i A , j A ) on the boundary of the grid. In order to find the optimal path from an internal node (i, j) to (i A , j A ), we set the value e iA,jA to zero, while for the remaining boundary nodes the value e ic,jc is set to a very large positive value. The updating process for the internal nodes is carried out according to Eq. 13, until all values e i,j no longer change. At this point, the value of e i,j contains the value of E O E i,j;i A ,j A along the optimal path between nodes (i, j) and (i A , j A ).    We now choose another boundary node (i B , j B ) as end point for the optimal paths. Hence, we fix e iB,jB 0 and fix all the other boundary nodes to a large positive value, including boundary node (i A , j A ). The internal node values are initially set to zero, e i,j 0. We then iterate according to Eq. 13. When numbers no longer change, e i,j will contain E i,j;i B ,j B E O for the optimal path between nodes (i, j) and (i B , j B ).
We may now combine the optimal paths starting at boundary node (i A , j A ) and ending at internal node (i, j) with the optimal path starting at internal node (i, j) and ending at boundary node (i B , j B ). The optimal weight E O for this combined path is then given by, E iA,jA;i,j;iB,jB min inn,jnn E iA,jA;i,j + e i,j;inn,jnn + E inn,jnn;iB,jB , E iA,jA;inn,jnn + e inn,jnn;i,j + E i,j;iB,jB .

(14)
Associating each internal node (i, j) with the value e i,j E iA,jA;i,j;iB,jB leads to the construction of a pathscape [21]. The optimal path between edge nodes (i A , j A ) and (i B , j B ) is the sequence of nodes associated with the smallest e i,j values. There will then be a sequence of nodes having the second smallest values e i,j . This sequence will branch out from the globally optimal path as some node, to rejoin it at a different node along the path. Then there will be sequence with the third smallest values e i,j branching off and rejoining nodes belonging to the two paths containing the two smaller e i,j -and so on. Each internal node will belong to some path in this hierarchy.

Predicting Paths
We now focus our attention on pathscapes where boundary nodes (i A , j A ) and (i B , j B ) are placed along different edges. There are 6N 2 possible combinations. It may be convenient to coarse grain the end points of the optimal paths. Hence, we divide each edge into intervals of length L I . This means that we set the weight of all the edge nodes nodes (i ′ A , j ′ A ) belonging to the interval, e i A' ,j A' 0. The pathscape will then consist of all optimal paths starting somewhere in the first interval, (i ′ A , j ′ A ), passing through internal node (i, j) and then ending at a node (i B' , j B' ) somewhere in the end interval. Hence, the number of pathscapes is then reduced from 6N 2 to 6n 2 , where n N/L I . Suppose the optimal path (i A' , j A' ; i, j; i B' , j B' ) has a length L i A' ,j A' ;i,j;i B' ,j B' and a weight E i A' ,j A' ;i,j;i B' ,j B' . Using the weight alone in predicting paths does not function well since a short path through a high-weight region may be as optimal as a longer path through a low-weight region. We therefore renormalize the weights, where β is an adjustable parameter. The constant C p is introduced to further separate between different optimal paths. We note that higher β makes longer paths more favorable. The five most optimal paths between different edges. Each edge has been divided into two intervals, and fixing α 0.8, and β 0.8. Frontiers in Physics | www.frontiersin.org June 2021 | Volume 9 | Article 656296

ANALYSIS OF TWO AUTOMATIC IDENTIFICATION SYSTEM PATTERNS
We denote the two AIS sets we consider in the following A and B. Figures 1, 2 show the two areas and the subsets that we use in our analysis. Both subsets, shown in Figures 1B, 2B, have size 100 × 100 and where each grid block has size 100 × 100 m 2 (area A) and 1000 × 1000 m 2 (area B). We see that A has a simpler structure than B, consisting of two vertical clusters, while B includes multiple clusters with different orientation. By "cluster" we mean an area with high density of position recordings. We now consider area A. Setting parameter β 0 in Eq. 15, we show in Figure 3 the optimal path between the lower left and upper right corners of the grid 1b for different values of α.
We show in Figure 4 the length of the optimal path between the lower left and upper right corner in Figure 3 for different values of α and with β 0, L O . We note that L O is approximately linear in α for α < 0.75, at which there is a jump.
Turning to area B, we show in Figure 5 the optimal paths starting from the lower left corner and ending at the upper right corner of the grid shown in Figure 2B as a fuction of α while keeping β 0.
We show in Figure 6 the length of the optimal path between the lower left and upper right corner in Figure 5 for different values of α and β 0. As in Figure 4 for area A, we find a jump in the length of the optimal paths for a given value of α, here α ≈ 0.4. However, there are clearly defined plateaus in the optimal path length, e.g., for 0.05 < α < 0.4.
We now introduce intervals L I as described in Section C. We consider first a more coarse grained section of area A, covered by a grid of size 148 × 148 with grid size 100 × 100 m 2 . We divide each edge into two sections. Figure 7A shows the two most optimal paths in this system. In Figure 7B, we show the five most optimal paths. We see in Figure 7B that several of the five optimal paths overlap considerably, creating an impression that there are fewer paths in the figure than there is in reality. Figure 8 shows the ordered sequence of weights E' O for the optimal paths in Figure 7. We see that the weights of the first five paths is quite similar, whereas from the sixth and onwards, it is significantly higher. Generating this flat region is accomplished by adjusting β and it signifies that these optimal paths are equally good.
We do the same construction as in Figure 7 for area B in Figure 9. We divide the edges into three intervals and choose the values α 0.08 and β 0.8 for the two adjustable parameters. Figure 10 shows the ascending sequence of renormalized weights E' O for the two cases shown in Figures 9A,B. We find 48 optimal paths with slowly increasing weights before it jumps to a much higher value. Figure 9 shows the same area as in Figure 5. We have here divided the edges into three intervals. The weight of the 50 most optimal paths is shown if Figure 10. As in the much simpler pucture in Figure 7, there is also in this case considerable overlap between the optimal paths. Here the grid has been divided into three segments along each edge. In (A) we show the 24 most optimal paths and in (B) we show the 48 most optimal paths. We fixed the parameter values to α 0.08 and β 0.8. We have in this paper introduced a method to predict motion in an area based on earlier motion in the same region. That is, given the history of traffic in the area, what would be the most likely path a new traveler would take between positions A and B, even if these positions are outside the usual routes of travel in the area. The method is based on the concept of optimal paths through a landscape formed by the paths taken earlier. It is a dynamic method as each new trajectory taken in the area is added to the history.
We have tested the method on marine Automatic Identification System (AIS) data. From a visual point of view, the method locates the motion patterns efficiently in both the simple case we studied (A) and in the more complex traffic picture (B). However, a proper performance test has not been performed. Further, the results showed that sectioning the edges into only a few intervals, were enough to generate a good estimate of the pattern. As all the grid nodes along the edges, may be represented by only a few intervals, with a short running time of the algorithm. The grid dimension and size of the cells does not seem to influence the results that are found.
This work shows that the method we propose manages to identify sensible paths that optimize between path length and frequency of use-two seemingly very different quantities. In order to turn this into a practical method, much more work is needed, e.g. with respect to the cluster identification, grid construction, type of vessel (if it is to be implemented as a marine tool).

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.