Bounded Cost Path Planning for Underwater Vehicles Assisted by a Time-Invariant Partitioned Flow Field Model

A bounded cost path planning method is developed for underwater vehicles assisted by a data-driven flow modeling method. The modeled flow field is partitioned as a set of cells of piece-wise constant flow speed. A flow partition algorithm and a parameter estimation algorithm are proposed to learn the flow field structure and parameters with justified convergence. A bounded cost path planning algorithm is developed taking advantage of the partitioned flow model. An extended potential search method is proposed to determine the sequence of partitions that the optimal path crosses. The optimal path within each partition is then determined by solving a constrained optimization problem. Theoretical justification is provided for the proposed extended potential search method generating the optimal solution. The path planned has the highest probability to satisfy the bounded cost constraint. The performance of the algorithms is demonstrated with experimental and simulation results, which show that the proposed method is more computationally efficient than some of the existing methods.


INTRODUCTION
Over the last few decades, autonomous underwater vehicles (AUVs) have been employed for ocean sampling (Leonard et al., 2010;Smith et al., 2010), surveillance and inspection (Ozog et al., 2016;Xiang et al., 2010), and many other applications. Ocean flow is the dominant factor that affects the motion of AUVs . Ocean flow dynamics vary in both space and time, and can be represented as geophysical Partial Differential Equations (PDEs) in ocean circulation models (e.g., the Regional Ocean Modeling System (ROMS, Shchepetkin and McWilliams, 2005;Haidvogel et al., 2008) and the Hybrid Coordinate Ocean Model (HYCOM, Chassignet et al., 2007)). While these models can provide flow information over a large spatial domain and forecast over several days, the available flow field forecast may still contain high uncertainty and error. The uncertainty comes from multiple sources, including the incomplete physics or boundary conditions (Haza et al., 2007;Griffa et al., 2004) and even terms in the equations themselves (Lermusiaux, 2006). In addition, the high complexity of the flow dynamics makes solving these PDEs computationally expensive. Data-driven flow models (Mokhasi et al., 2009;Chang et al., 2014) can provide short-term flow prediction in a relatively smaller area with significantly lower computational cost, and can be more suitable for supporting real-time AUV navigation, particularly for systems with strong gradients and/or high uncertainty.
Path planning is one of the crucial and fundamental functions to achieve autonomy. Two key considerations for an AUV path planner are computational efficiency and path quality. The path planning strategy should be computationally efficient so that the time for generating a path can be kept to a minimum. When these methods are sufficiently fast, path planning can be performed in near-real time, generating a feasible solution in minutes while the AUV has surfaced to get a GPS fix and communicate with shore. Advantages of real time path planning are that more recent information, including the real time data from the vehicle, can be incorporated in path planning. Hence there will be less planning error due to outdated information (Leonard et al., 2010). At the same time, it is desired that the path planning algorithm has theoretical guarantee on the quality of the generated path.
Most path planning algorithms aim to design optimal path minimizing certain cost, for example, those associated with engineering or flight characteristics (battery life, travel time) or scientific value (e.g., distance relative to other assets or spacing of relevant processes). Algorithms that have been applied to AUV optimal path planning include: 1) graph-based methods such as the A* method (Rhoads et al., 2012;Pereira et al., 2013;Kularatne et al., 2017Kularatne et al., , 2018 and the Sliding Wavefront Expansion (SWE) (Soulignac, 2011); 2) sampling-based methods like the Rapidly exploring Random Trees (RRTs) (Kuffner and LaValle, 2000;Cui et al., 2015), RRT* (Karaman and Frazzoli, 2011) and informed RRT* (Gammell et al., 2018); 3) methods that approximate the solution of HJ (Hamilton-Jacobi) equations, such as the Level Set Method (LSM) (Subramani and Lermusiaux, 2016;Lolla et al., 2014), and 4) the evolutionary algorithms, including the particle swarm optimization methods (Roberge et al., 2012;Zeng et al., 2014), and the differential evolution methods (Zamuda and Sosa, 2014;Zamuda et al., 2016). See (Zamuda and Sosa, 2019;Zeng et al., 2015;Panda et al., 2020) for a comprehensive review on the existing AUV path planning methods. However, the computational cost of the above mentioned methods could be high, especially in cases where the AUV deployment domain is large.
Using a regular grid to discretize the flow field can result in unnecessary large number of cells, which increases the computational burden of the graph search methods. Since the flow speed in adjacent cells is usually similar, we partition the flow field into piece-wise constant subfields, within each the flow speed is a constant vector, and introduce the Method of Evolving Junctions (MEJ) (Zhai et al., 2020) to solve the optimal path planning problem. MEJ solves for the optimal path by recasting the infinite dimensional path planning problem into a finite dimensional optimization problem through introducing a number of junction points, defined as the intersection between the path and the region boundaries. Hence the computation cost of MEJ is significantly lower than other optimal path planning methods, especially when the flow field is partitioned into a small number of cells (Zhai et al., 2020). We identify Soulignac. (2011) as the work closest related to ours, in which sliders, defined as points sliding on the partitioned region boundaries, are introduced to describe the wavefront expansion of the graph search methods. In each iteration of the wavefront expansion, each slider's position on the wavefront is derived by minimizing the travel cost in a single cell, and then the planned path is computed by the backtracking of wavefronts. Both MEJ and SWE are based on a novel parameterization of the path by introducing the junctions and the sliders, that were discovered independently by the two research groups. The main difference between MEJ and SWE lies in that MEJ solves for junction positions by formulating a non-convex optimization problem, and derives the global minimizer by intermittent diffusion (Li et al., 2017), which intermittently adds white noise to the gradient flow, while SWE solves for slider positions by graph search methods. MEJ has been justified to find the global minimizer with probability 1. However, since the method does not pose any structure in the search, the computational cost of MEJ could be less favorable compared to SWE if the number of cells scales up.
To reduce the computational cost of the path planning problem, the search can be reduced to find paths with total cost less than an upper bound. Stern et al. (2014) present two algorithms to solve the bounded cost search problem: the Potential Search (PTS) and the Anytime Potential Search (APTS). The PTS method defines the Potential Ordering Function, which is an implicit evaluation of the probability that a node is on a path satisfying the bounded cost constraint, and iteratively expands the nodes in the graph with the highest Potential Ordering Function value. The wavefront expansion terminates when the goal node has been expanded, and the path is found by backtracking of the wavefronts. The APTS method runs the PTS algorithm iteratively to improve on the incumbent solution, with the upper bound on total cost lowered in each iteration of the algorithm. Later work (Thayer et al., 2012) improves on the PTS method by minimizing both the potential and an estimation of the remaining search effort, so that the bounded cost search problem will be solved faster.
In this paper, our first objective is to develop a data-driven computational flow model that approximates the true flow field in the region of interest to assist AUV path planning. The proposed data-driven flow model divides the flow field into cells, within which the flow is represented as a single flow vector. The optimal flow cell partition and initial values of the flow vectors in each cell are derived from prior flow information, from numerical ocean models or from observations. To improve model accuracy, AUV observational data can be incorporated into the data-driven model in near-real time, for example, in the form of observed or estimated velocities (Chang et al., 2015). Here, we design a learning algorithm that estimates the flow field parameters based on the AUV path data. Our second objective is to develop an algorithm that solves the AUV bounded cost path planning problem. Given that the vehicle is traveling in a flow field represented by the data-driven computational model, the goal is to design a path that connects AUV initial position with goal position with the highest probability to have travel cost less than a preassigned upper bound. By introducing the key function, which is an implicit evaluation function of the probability that a path satisfies the bounded cost constraint, the optimal path is computed by searching for the nodes with lowest key function value using an informed graph search method.
The main novelty of this work is introducing the modified PTS method to solve the bounded cost search problem. Unlike the PTS method (Stern et al., 2014), which assumes that the branch cost of the graph is known exactly, our method deals with problems where the branch cost of the graph is uncertain. Given assumptions on the distribution of cost-of-arrival and cost-togo, we prove that the proposed algorithm guarantees optimality of the planned path, that is, the planned path has the highest probability of satisfying the bounded cost constraint. To the best of our knowledge, this is the first time that the optimality of the modified PTS solution to bounded cost problems is proved. The proposed bounded cost path planning method can be viewed as an extension to MEJ. Compared to MEJ, the modified PTS algorithm is computationally more efficient, since a graph search method is adopted to search for the junction positions. At the same time, optimality of the planned path can be theoretically justified. The major benefit of the proposed bounded cost path planning algorithm lies in that it plans a path faster, while at the same time still guarantees the path quality. This paper is a significant extension of the conference proceeding (Hou et al., 2019), which proposes a flow partition method that approximates the flow field by a set of cells of uniform flow speed. The main extensions of this paper are that taking advantage of the flow model proposed in (Hou et al., 2019), we propose the modified PTS method to solve the AUV bounded cost path planning problem, and present theoretical justification on the optimality of the proposed PTS method. The proposed bounded cost search method is potentially applicable for all bounded cost path planning problems with uncertain branch cost.
We believe the proposed data-driven flow modeling and bounded cost path planning methods are well-suited for path planning of underwater glider deployment near Cape Hatteras, NC, a highly dynamic region characterized by confluent western boundary currents and convergence in the adjacent shelf and slope waters. While deployed, the gliders are subject to rich and complex current fields driven by a combination and interaction of Gulf Stream, wind, and buoyancy forcing, with significant crossshelf exchange on small spatial and temporal scales (Savidge et al., 2013a, Savidge et al., 2013b that would be highly difficult to sample using traditional methods. Path planning must consider spatial variability of the flow field. Because spatial gradients are significant, real-time path planning is critical to take advantage of real-time data streams. Through simulated experiments, we demonstrate the performance of applying the proposed algorithms to underwater glider deployment in this area, and show that the proposed algorithm is more computationally efficient than A* and LSM.

Vehicle Dynamics
Let F R : D → R 2 represent a spatially distributed vector field for the ambient flow velocity, where D ∈ R 2 is the domain of interest.
Let [T 0 , T f ] be the AUV deployment time interval. The AUV model is described as where x ∈ D denotes vehicle position. V R is the through-water speed of the vehicle, and Ψ C (t) [cosψ C , sinψ C ] T is a unit vector that represents the direction of the vehicle motion along heading angle ψ C . Assumption 2.1: During the operation, V R is an unknown constant.
Remark 2.1: Actual vehicle speed may depend on a number of factors that affect an AUV's speed, including water depth, efficiency of propulsion, and bio-fouling. These effects are difficult to estimate. Hence the vehicle forward speed is assumed to be an unknown constant.
Assumption 2.2: We assume that the heading Ψ C (t) can be controlled for all time t, and the vehicle trajectory x(t) can be measured or estimated for all time.
Remark 2.2: Though the actual location of a vehicle may only be known occasionally when the vehicle is underwater, the trajectory of the vehicle can be estimated through localization algorithms, which incorporates the known locations and the heading angle commands as inputs to generate the optimal state estimation. Assumption 2.3: We assume the flow field is time-invariant throughout the deployment. Remark 2.3: Even though there are existing work that considers the time-variant flow field in solving the AUV planning problem, such as (Eichhorn, 2013;Lolla et al., 2014;Zamuda and Sosa, 2014), we make this assumption due to the patterns of the flow field in this domain. In the domain of interest considered in this paper, which is near Cape Hatteras, NC, the current field is driven by a combination and interaction of Gulf Stream, wind, and buoyancy forcing (Savidge et al., 2013a, Savidge et al., 2013b. Because magnitude and spatial gradients of the flow field are significant relative to the temporal variation of the flow field (mostly the tidal flow component), time variation of the flow does not have a significant influence over the planned path.

Data-Driven Flow Modeling
Flow speed at neighboring grid points often exhibits similarity in both strength and direction. Hence we assume that at the time scale of an AUV deployment, the flow field can be divided into finite number of regions {R i } i∈I R , with the union of all cells being the domain, ∪ i∈I R {R i } D. The regions are separated by continuous boundary curves. boundary curves {f i,j } i,j∈I R , and f ij (x) 0 is the one dimensional compact boundary of the region R i and R j . We define an indicator function ϕ i (x) as follows: This function indicates whether x is in R α . Let ϕ : R 2 → R N be defined as ϕ(x) ϕ 1 (x) . . . ϕ N (x) T . Then ϕ(x) are a set of spatial basis functions of D.
In order to compute the partition, which is represented by the basis functions ϕ. We need to use prior information of the environment obtained either from forecast data of the existing ocean models, or from historical datasets. Let F 0 : D × [T 0 , T f ] → R 2 denote the discretized flow map forecast available on a set of grid points in D, and let y ∈ R 4 , y [x T , F 0 (x, t) T ] T denote the vector at position x and time t. We can define a distance function as dist : R 4 × R 4 → R as dist 2 (y, y ′ ) (y − y ′ ) T Q(y − y ′ ), where y, y′∈ R 4 , Q is a weight matrix. For each cell R i , let r i represents its center, and let ] i be the flow vector in this cell. Our goal is to find the optimal values of ϕ(x) and {] i } i∈I R by solving the following optimization problem: T denoting the flow vector in partitioned region R α . Then the partitioned flow field can be represented as To estimate the true flow field F R , we use the AUV path data x(t). Let be our estimate of the parameter θ and V L (t) be our estimate for V R . We will design a learning algorithm to achieve c convergence of ξ(t) and V L (t) to the true values e.g., ξ(t) → θ and V L (t) → V R as t → ∞.

Bounded Cost Path Planning
Our goal is to find a path connecting the vehicle current position x 0 to the final position x f that results in total travel time less than an upper bound C. In practice, the planning and replanning process of AUV happen over long intervals, in order to avoid increased computation cost. Hence we assume that the estimated parameters have converged to their true value counterparts before the planning process. There may be more than one path satisfying the bounded cost constraint. Thus we formulate the following optimization problem, in which the decision variable is the vehicle's heading angle, ψ C (t). The optimization problem is to find the decision variable that is most likely to satisfy the bounded cost constraint: where the total travel time to start from the initial position x 0 to reach the destination position x f under the control signal ψ C (t) is denoted as T(ψ C (t)).

FLOW FIELD ESTIMATION
First let us describe Algorithm 1. We derive the spatial basis function and the initialized flow model parameters by solving eq. 2 using the K-means algorithm. Since this optimization depends on both spatial and temporal variance of the flow field, solving this problem can be computationally expensive. To simplify this problem, instead of optimizing the difference between the timevarying flow forecast and the partitioned flow field, as described in eq. 2, we optimize the difference between the time-averaged flow forecast and the partitioned flow field: To implement the K-means algorithm, we start by randomly selecting k cell centroids, and then use Lloyd iterations to solve the optimization problem. The Lloyd iteration contains two steps, first assign the points that are closest to a centroid to that centroid, and then recompute the cell centroid. These two steps are repeated until cell membership no longer changes. The K-means algorithm requires proper selection of the number of partitioned cells, k, the choice of which affects the path planning performance and flow modeling quality. If the field is divided into too many regions, then it results in a complicated flow structure and potentially increases the computational cost of path planning. On the other hand, dividing the field into too few regions may result in a large error between the true flow field and the modeled flow field, which potentially leads to significant path planning error. Therefore, we introduce an iterative K-means algorithm that can guarantee a bounded flow field partition error, and at the same time utilize the smallest number of partition regions.
Let F 0 : D → R 2 denote the time-averaged flow field over the time interval [T 0 , T f ], and ] i is the uniform flow velocity in R i . Define the flow field partition error as: Given an initialized k, we will iteratively perform the K-means algorithm (lines 8, 9), and check if the flow partition error satisfies δF < ε, where ϵ is a pre-defined upper bound on the flow partitioning error. If this condition is satisfied, then the current k is designated for partitioning; otherwise, the number of cells is increased by 1, and we recompute the solution to eq. 5 using K-means method. Now let us explain Algorithm 2. An estimate z(t) for the vehicle trajectory can be computed by integrating where β(t) ∈ R 2 is introduced as a learning injection parameter.
The term e x − z is the controlled Lagrangian Localization error (CLLE, Cho and Zhang, 2016;Cho et al., 2021), which describes how much the actual trajectory deviates from the estimated trajectory. A learning algorithm will then compute β(t), ξ(t), V L (t) so that the CLLE can be reduced. The CLLE dynamics can be derived from eqs 7, 1.
Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 We design the learning parameter injection as and hence the CLLE dynamics becomes The learning algorithm updates parameters ξ(t) and V L (t) so that the CLLE converges to zero. Let where ⊗ is the Kronecker product. We design the updating rules for parameter estimation as follows, These rules are then used in Algorithm 2.

BOUNDED COST PATH PLANNING
Given the piece-wise constant flow model described in eq. 3, the domain is divided into a finite number of regions {R i } i∈I R . Thus, all possible trajectories cross a sequence of cells of uniform flow, and finally reach the goal position. Since the vehicle moves in constant speed, and the flow in one cell is uniform and constant, the vehicle's optimal heading angle in each cell should be constant, and the vehicle path in each cell is a straight line. We define junction points as the position where the path intersects with cell boundaries. Below we show that in each cell, due to the time invariance of the flow field, solving for the heading angle is equivalent to solving for the junction points of a path. Let c 1 , c 2 denote two junction points on two different boundary curves of the same cell R i . Since the vehicle moves at constant speed, the total vehicle speed V R Ψ C + θ i must be in the same direction as the segment of the path, From eq. 12, we can represent the vehicle's heading angle as a function of the junction points, The vehicle's travel time in R i , denoted as τ, can be computed given the junction points and the vehicle's heading angle, Combining eq. 14 and eq. 12 we can write the travel time in cell R i as a function dependent only on the junction points, eq. 15 describes the travel time in one single cell. Based on the discussion of the one cell case, next we talk about solving eq. 4 across multiple cells. Let c 1 , c 2 , . . . , c n denote the chain of junctions position, and p 1 , p 2 , . . . , p n+1 represent the index of the sequence of cells that the path crosses. The planning problem eq. 4 can be transformed into the following mixed integer optimization problem, where the decision variables are the cell sequence and the junctions position, Now let us explain the proposed solution to eq. 16. The solution is presented in Algorithm 3. We propose a bounded cost path planning algorithm that solves for the path that is most likely to satisfy the bounded cost constraint. The solution contains two steps, the first step is solving for the optimal sequence of cells that is most likely to result in a bounded cost path, in the discretized flow map described by the piece-wise constant flow cells. In this step, the junction positions are unknown. An informed graph Update β(t) using eq. 9 Update ξ and V L (t) using eq. 11 Update z(t) by integrating eq. 7 end ALGORITHM 1 | Flow Field Partition Algorithm Compute δF using eq. 6 end end Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 search method is used for the first step of the proposed solution. The second step is optimizing junction positions on the boundaries of the optimal cell sequence. The proposed solution is presented in Algorithm 3. First we describe the first step of our solution. Consider having a candidate junction on boundaries of all cells. Two junctions c i and c j are defined as adjacent if indicating that the two junctions are on different boundaries of the same cell. Two adjacent junctions are connected by an edge. A non-directed graph G can be formed with the vertices being all the candidate junctions in the domain, and the edges being the path segment between the adjacent candidate junctions. Let n i,j describe the node that corresponds to the junction on boundary curve between R i and R j , and let s and g denote the node corresponding to the starting and final position.
Then in this context, a path Γ from the starting position x 0 to the final position x f , crossing the cells R p 1 , . . . , R p n+1 in sequence can be represented by a sequence of nodes s → n p1p2 → n p2p3 → . . . → n pnpn+1 → g on the graph. Figure 1 is an example of the graph representation of the workspace, in which case the flow field is partitioned into 4 cells.
The branch cost of the graph is defined as the travel time from one junction to another adjacent junction. The travel time can be computed by eq. 15 if the two junction positions are known. However, since the junction positions are unknown when optimizing the cell sequence, the branch cost of the graph cannot be explicitly computed. Hence we introduced the following assumption: Assumption 4.1: We assume that the branch cost for all edges in E is a random variable, with a known minimum value.
Remark 4.1: Even though the branch cost is unknown, its minimum value can be computed, since the branch cost eq. 15 is convex with respect to c 2 − c 1 (Soulignac, 2011). We solve for the minimum cost of all edges in the graph, denoted as w p ij,jk by solving the following constrained optimization problem, using the interior-point method. (Kim et al., 2007), The informed graph search method we propose is an extension to a class of graph search algorithms called potential search (PTS) (Stern et al., 2014). The PTS algorithms can be viewed as modifications to the celebrated A* algorithm for path planning (Hart et al., 1968). To determine which nodes should be searched, the algorithms maintain an OPEN list and a CLOSED list. A graph node is labeled NEW if it has not been searched by the algorithm. The OPEN list contains all the nodes that are searched, but still have a NEW neighbor. The CLOSED list consists of all the nodes that have been accessed by the search algorithm.
To determine which cells should be searched first by the algorithm, the algorithm computes the cost-of-arrival, which is the minimal cost of going from the starting node s to an arbitrary node n, and cost-to-go, which is the minimal cost of going from n to the goal point g. Let g p (n) denote the actual cost-of-arrival, and let h p (n) denote the actual costto-go of a node n. Since the actual cost-to-go is unknown during the search, a heuristic cost h(n) ≤ h p (n), is usually used by the search algorithm. The A* search algorithm sort the OPEN list according to the value of g p (n) + h(n). The node with lowest value is searched first. In our problem, the following estimated cost-to-go is used to guide the search: The heuristic function defined in eq. 18 is the travel time of the vehicle traveling in the most favorable flow condition, reaching goal position from the junction position that is closest to the goal. Hence, h(n) ≤ h p (n), which is required by A* search. However, in our problem, the branch cost is unknown. The exact value of

ALGORITHM 3 | Bounded Cost Search in Piece-wise Constant Flow Field
Data: Start and goal node s, g, start and goal position x 0 , x f , travel cost upper bound C, graph G Output: Optimal heading angle Ψ C PTS (s, g, C, G) → P *Step 1: Find the optimal cell sequence MEJ (P, x 0 , x f ) → Ψ C *Step 2: Find the optimal heading angle Function PTS a(s, g, C, G) Initialization. g(n) ∞, h(n) ∞, K(n) ∞, ∀n ∈ G s → {CLOSED} Set the heuristics and estimated cost-of-arrival of s Initialize junction set c 0 on the boundary curves of the cell sequence P Compute junction positions by solving eq. 20 Compute heading angle from junction positions by eq. 21 return Ψ C (t) Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 actual cost-of-arrival cannot be computed during the search process, which is different from a typical path planning problem that can be solved by A* or PTS method. Hence we introduce the estimated cost-of-arrival, denoted as g(n). The estimated cost-of-arrival is computed by summation of the minimum branch cost along the path, thus g(n) ≤ g p (n).
The goal of A* search is to find the path with minimum cost. In our problem, due to the uncertainties in the branch cost, this goal is overly ambitious. Hence our problem formulation eq. 16 aims to find a path with bounded cost. We define a potential function described as follows: Definition 4.1: The potential of a node n, denoted as PT(n), is Pr(h p (n) + g p (n) < C).
The potential function characterizes the probability that a node is on a path that satisfies the bounded cost constraint. Nodes with high potential have higher probability to be part of the desired path. However, the exact potential of nodes cannot be computed or compared, since both h p (n) is unknown before the optimal path is found. Therefore, PTS algorithms usually design a key function to determine the nodes that need to be searched at each step of the graph search. Nodes in the OPEN list are sorted by the key function value instead of g p (n) + h(n), which is the main difference between the PTS algorithms and the A* method.
Various key functions have been proposed for different path planning problems with bounded cost (Thayer et al., 2012;Stern et al., 2014, Stern et al., 2011. One significant contribution of this paper is in extending the PTS method to solving bounded cost problems with uncertain branch cost, by introducing a new form of key function K(n) ∈ R ≥ 0 as K(n) is an indication of the probability of the node n being on a path that satisfies the bounded cost constraint. Nodes with lower key function value have a higher probability of being on a path satisfying the bounded cost constraint. The intuition is that, if h(n) + g(n) < C, the key function value increases if either h(n) or g(n) is larger. In this case, the estimated cost h(n) + g(n) increases, and will be closer to C, then it is less likely that the true cost satisfies the bounded cost constraint, and the node n is less likely to be on a feasible path. If h(n) + g(n) ≥ C, then n cannot be on a path satisfying the bounded cost constraint, since h p (n) + g p (n) ≥ h(n) + g(n) ≥ C. In this case, the key function is set as positive infinity.
The PTS with our new key function is then applied to search for the optimal cell in Algorithm 3. The only difference between our PTS and A* is that the total cost used by A* to sort the OPEN list is replaced by the key function, as shown in line 13. Similar to A*, The search algorithm consists of two processes: the expansion process and the backtracking process. During the iterative expansion process, the algorithm orders the nodes in the OPEN set according to the key function value, and inserts the node with the lowest key function value to the CLOSED set (lines 19, 20). Neighbors of this node and their key function values are updated if the neighboring nodes can be reached with a lower cost through the current node (lines 26, 28, 29). The propagation continues until the OPEN list is depleted, or the goal node is in the OPEN set. Starting from the goal position, the backtracking process searches for the predecessor of the last node in the path set and add it to the path, until the starting node is included in the path (lines 37, 38).
The PTS algorithm fulfills step one of the bounded cost path planning solution. We have found the vector of indices {p i } n+1 i 1 , which indicates the optimal indices that is most likely to result in a bounded cost path. In step two, we find the optimal junction positions that leads to the minimum total cost. Given the optimal cell sequence, the problem eq. 16 converts to an optimization problem depending on the junction positions {c i } n i 1 in all cells, This optimization problem is solved by the interior-point method.
The optimal heading angle can be computed from the junction Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 positions using eq. 12. In each cell of the sequence {p i } n i 1 , given the optimal junction position c i+1 and c i , the heading angle in cell R p i can be derived by

THEORETICAL JUSTIFICATION
In this section, we give theoretical justification to the proposed datadriven flow modeling and bounded cost path planning method.

Data-Driven Flow Modeling
Algorithm 1 can be theoretically justified by proving that the optimal solution to eq. 5 is the optimal solution to eq. 2. We also prove that Algorithm 2 achieves error convergence and parameter convergence, indicating that the estimated trajectory converges to the actual trajectory, and the estimated parameter converges to the true values. Lemma 5.1: The optimal flow partition derived by solving eq. 2 is the same as the optimal flow partition derived from eq. 5.
The second term in J represents the temporal variation of flow speed on one grid point, which does not change with respect to the partitioning of the flow field. Hence arg min where J′ is defined in eq. 5. Thus the optimal solution of eq. 2 equals the optimal solution of eq. 5, which implies that the optimal flow partition of the time-varying flow field is equivalent to the optimal flow partition of the timeinvariant flow field, computed by taking the time-average of the flow field observations, as described in line 1, Algorithm 1.
Next we will prove that under Assumption 2.3, the estimated trajectory converges to the actual trajectory, and that the estimated parameter converges to the true value using adaptive control theory. In order to prove convergence, the persistent excitation condition must be demonstrated, and is given below.
, which is the input signal to eq. 24. We can construct a matrix W(t) ∈ R (2N+1)×(2N+1) as follows The persistent excitation condition is critical to prove the convergence of parameters (Narendra and Annaswamy, 1989). When W(t) is singular the estimation errors of parameters may not converge to zero. The persistent excitation condition requires that the trajectories traveled by the robot to spread over all the partitioned cells, as stated in the following Lemma. Lemma 5.3: The signal vector w is persistently exciting if the vehicle visits all the partitioned cells.
PROOF: Since the partitioned cells do not overlap with each other, for ∀τ, x(τ) can only be in one cell. Hence for all i, j ∈ I R , Thus W(t) can be simplified to If ∀i, ∃τ ∈ [t, t + T], s.t. ϕ i (x(τ)) 1, meaning that the vehicle visits all cell during time [t, t + T], then W(t) is full rank, and hence w is persistently exciting. The persistent excitation condition must be satisfied in order to have the flow parameters of all the cells and vehicle speed estimation converge to the true value. The persistent excitation condition requires the vehicle to visit all the partitioned regions. If this condition is not satisfied, not all flow parameters in the partitioned cells can be accurately estimated. We will further address this condition in the simulation and experimental result section.The convergence of CLLE is presented as follows.
Theorem 5.4: Under the updating law eq. 11, CLLE converges to zero when time goes to infinity. PROOF: Consider the following Lyapunov function, Since e T (θ − ξ(t))ϕ(x) (θ − ξ(t))e ⊗ ϕ(x), the derivative of V is Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 _ V is negative semi-definite, which implies that e, ξ, V L (t) are bounded. In addition, the second order time derivative of V is Thus € V bounded, and hence _ V is uniformly continuous. Therefore lim t → ∞ _ V(t) 0. Since K is a diagonal matrix, e(t) → 0 as t → ∞.
Theorem 5.5: Under the updating law eq. 11, if the vehicle visits all the partitioned cells, ξ(t) and V L (t) converges to θ and V R respectively as time goes to infinity.
PROOF: Let η 1 θ 1 − ξ 1 (t), η 2 θ 2 − ξ 2 (t), η 3 V R − V L (t), then the CLLE dynamics can be written as We define a new state variable X [e T , η T 1 , η T 2 , η 3 ] T , and an output variable Y e, then the dynamics for the state variable and the output variable satisfy Our goal is to show that the origin of _ X A(t)X is uniformly asymptotically stable, which indicates that ξ converges to θ, and V L (t) converges to V R . Let There exists some c 1 , c 2 such that c 1 I ≤ P ≤ c 2 I, and there exists some constant 0 < ] < 1 such that which is negative semi-definite. Then by the Lyapunov theorem (Theorem 3.8.4 in (Ioannou and Sun, 1995)), _ X A(t)X is uniformly asymptotically stable if we can prove that (C, A) is uniformly completely observable. First, we will find a bounded matrix L, and show that (C, A + LC) is uniformly completely observable. Then, this will lead to the conclusion that (C, A) is uniformly Since Ψ C is uniformly bounded, and all elements inφ is either 0 or 1, L is uniformly bounded, and Thus, we now consider the observability of Let η [η 1 , η 2 , η 3 ] T , then the system eq. 23 has the following form: Due to the assumption that the vehicle visits all cells, by Lemma 5.3, w is persistently exciting. Let Φ(τ) τ t e −K(τ−σ) w(σ)dσ be the output of eq. 24 given input w. Then Φ(τ) satisfies persistently exciting conditions because w(σ) is persistently exciting, and the transfer function of eq. 24, (sI + K) −1 is stable, minimum phase, proper rational transfer function. Therefore, there exists constant By applying Lemma 4.8.4 in (Ioannou and Sun, 1995), we can conclude that the system of eq. 24 is uniformly completely observable. In other words, we have proved that (C, A + LC) is uniformly completely observable. By applying Lemma 4.8.1 in (Ioannou and Sun, 1995) the system (C, A) is uniformly completely observable. Therefore, the origin of _ X AX is uniformly asymptotically stable, that is, X → 0 as t → ∞. This means that η 1 , η 2 and η 3 go to zeros, individually. Thus ξ and V L (t) converges to θ and V R , respectively.

Bounded Cost Path Planning
In this subsection, we prove that Algorithm 3 finds the optimal solution of eq. 4. Assumption 5.1 and Assumption 5.2 are required for the optimality proof.
Assumption 5.1: Consider any node not the starting node s or the goal node g, the estimated cost-of-arrival g(n) and the estimated cost-to-go h(n) are bounded below, g(n) ≥ g min , h(n) ≥ h min , where g min > 0 and h min > 0.
Remark 5.1: For any node that is not the goal node, h(n) reaches its minimum when n is an adjacent node of g. Similarly, for any node that is not the start node, g(n) reaches its minimum when n is adjacent to s. Since the flow partition algorithm is performed over discrete grid points in D, size of the cells cannot be infinitely small. Therefore, h(n) can only be zero if the junction represented by the node n is sliding on the same boundary of the goal point, and g(n) can only be zero if the junction represented by the node n is sliding on the same boundary of the start point. However, by junction assignment, only one junction can be assigned on each boundary. Hence there exists h min and g min that bound h(n) and g(n) from below, and the lower bound cannot be infinitely small.
Let H max max C h min , C g min . Consider {X n } N n 1 , {Y n } N n 1 to denote sequences of independent and identically distributed random variables uniformly distributed over [1, H max ]. To prove optimality of the algorithm, we make assumptions on the statistical relationship between h p (n), h(n), and g p (n), g(n) as follows.
Assumption 5.2: The true cost-to-go, h p (n) and the heuristic function h(n), as well as the true cost-of-arrival, g p (n) and estimated cost-of-arrival, g(n) both satisfy h p (n) h(n)Y n , g p (x) g(n)X n .
Remark 5.2: Both h p (n) and g p (n) are summation of branch cost along the optimal path. Since the branch cost is the travel time between two adjacent junctions sliding on two boundaries, the branch cost of all edges must have both a lower bound and an upper bound. Hence both h p (n) and g p (n) are assumed to be a uniform distribution, with the minimum of it being h(n) and g(n), and the maximum being h(n)H max and g(n)H max . In practice, the statistical model of h p (n) and g p (n) depends on the distribution of the flow field, and may not be uniform distribution in some flow cases. However, the following theoretical analysis can be adapted to other parameterization of the statistical model of h p (n) and g p (n).
We will show below that, by expanding the nodes with the lowest key function value without explicitly calculating the potential of nodes, the proposed algorithm expands the nodes with the highest potential value, and thus guarantees to find the optimal solution to eq. 4. Lemma 5.6 states that the key function is an equivalent evaluation of the potential value of nodes. Lemma 5.7 demonstrates that the optimal path can be equally defined by either the potential or the key function value of nodes. Finally, given the two Lemmas, we justify the optimality of the proposed algorithm, which is stated in Theorem 5.8.
Lemma 5.6: For all n 1 , n 2 ∈ G, PT(n 1 ) < PT(n 2 ) if and only if K(n 1 ) > K(n 2 ). PROOF: To simplify notation, let h 1 , h * 1 , g 1 , g * 1 denote h(n 1 ), h * (n 1 ), g(n 1 ), and g * (n 1 ), respectively. The Lemma trivially holds in the cases where either K(n 1 ) or K(n 2 ) is infinity. Below we show that the Lemma holds in the case where both K(n 1 ) and K(n 2 ) are not infinity; equivalently, h 1 + g 1 < C and h 2 + g 2 < C. Due to the i.i.d. distribution assumption stated in Assumption 5.2, X 1 , X 2 can be written as a single random variable uniformly distributed on [1, H max ], and Y 1 , Y 2 also can be written as a single random variable uniformly distributed on [1, H max ]. Therefore, PT(n 1 ) < PT(n 2 ) if and only if Terms in the above inequality can be computed by integration of the probability density function, S1 ρ X,Y x, y dxdy < S2 ρ X,Y x, y dxdy, where S 1 {(x, y) h 1 y + g 1 x < C, x ∈ [1, H max ], y ∈ [1, H max ]}, S 2 {(x, y) h 2 y +g 2 x < C, x ∈ [1, H max ], y ∈ [1, H max ]}. By Assumption 5.2, X ≤ C g min , Y ≤ C h min , and therefore S 1 , S 2 are two triangles, as shown in Figure 2. Due to the uniform distribution of X, Y, the above integration can be easily computed by multiplying area of S 1 , S 2 with the joint distribution ρ X,Y (x, y), which is a constant. Hence, which implies K(n 1 ) > K(n 2 ). Let the ordered sequence Γ denote a path connecting the start node s with the goal node g in the graph G. Define K max (Γ) as the highest key function value of all nodes on the path Γ, that is, K max (Γ) max n∈Γ K(n) . Lemma 5.7: The optimal path minimizes K max (Γ) over all paths in the graph.
PROOF: Let Γ p denote the optimal path that maximizes Pr(h p (n) + g p (n) < C). Suppose that there is a path Γ′ that is different from the optimal path Γ p with K max (Γ′) < K max (Γ p ), then there exists n′∈ Γ′, and n ∈ Γ p that satisfies K(n′) < K(n). By Lemma 5.6, PT(n′) > PT(n), indicating that Pr(h p (n′) + g p (n′) < C) > Pr(h p (n) + g p (n) < C), w h i c h contradicts the assumption that Γ p is the optimal path. Hence, a path is the optimal one if it minimizes K max (Γ).
Conversely, let Γ arg min Γ″∈G K max (Γ″), then for all Γ′ that is different from Γ, for all n ∈ Γ, ∃n′∈ Γ′, such that K(n) < K(n′). Thus by Lemma 5.6, PT(n) > PT(n′) for n′ in any arbitrary path that is not Γ in the graph, and hence Γ that minimizes K max (Γ) is the optimal path. Theorem 5.8: When a feasible solution exists, the proposed algorithm terminates if and only if an optimal path is found. PROOF: Algorithm 3 can only terminate by finding the goal node, or after depleting the OPEN set. However, the OPEN set can never be empty before termination if there is a feasible path from s to goal point. Hence Algorithm 3 must terminate by finding a goal point.
Next we show that Algorithm 3 terminates only by finding an optimal path to the goal node. Suppose that the algorithm terminates by finding a path, Γ′ other than the optimal path Γ p , then by Lemma 5.7, K max (Γ p ) < K max (Γ′), that is, there exists n′∈ Γ′, n ∈ Γ p such that K(n) < K(n′). Thus during the propagation process, Algorithm 3 would have selected n for expansion rather than n′, contradicting the assumption that the algorithm terminates by finding Γ′. Hence the algorithm must terminate by finding the optimal path to the goal node.

Complexity Analysis
In our analysis, we derive the worst case running time for Algorithm 3, and compare it with dynamic programming based planning methods, such as A*, to demonstrate the computational efficiency of the proposed planning algorithm. Let us suppose the flow field forecast is available on N × N grid points in the deployment domain, and suppose that the domain is partitioned into M cells by performing Algorithm 1.
To derive the worst case running time of the proposed algorithm, we first consider the partitioning. Since one junction  − 1)). The worst case running time of A* is O(2N 2 log N) (Nilsson, 2014). Thus, the proposed algorithm is more computationally efficient than A* if M(M − 1) < N 2 , meaning that Algorithm 1 partitions the domain into less number of cells than the number of rectangular cells in the original gridded domain.

EXPERIMENT AND SIMULATION RESULTS
In this section, we provide the results of the implementation of our flow field modeling and path planning methods in a simulated experiment. First, we describe the simulated experimental set-up and recent field experiments, which serve as a strong test of the methods due to the magnitude and variability of the flow. We validate the proposed flow modeling algorithm by comparing the estimated flow model parameters generated from the proposed flow estimation algorithm with the glider estimated flow collected during the experiment. Based on the estimated flow model, simulation of the bounded cost path planning algorithm is performed, and its performance is compared with other AUV path planning algorithms.

Experimental and Simulation Setup
Our study is motivated by the use of underwater gliders off the coast near Cape Hatteras, North Carolina, US as part of a 16-months experiment (Processes driving Exchange At Cape Hatteras, PEACH) to study the processes that cause exchange between the coastal and deep ocean at Cape Hatteras, a highly dynamic region characterized by confluent western boundary currents and convergence in the adjacent shelf and slope waters. Underwater gliders, AUVs that change their buoyancy and center of mass to "fly" in a sawtooth-shaped pattern, were deployed on the shelf and shelf edge to capture variability in the position of the Hatteras Front, the boundary between cool, fresh water on the shelf of the Mid Atlantic Bight and the warmer, saltier water observed in the South Atlantic Bight.
While the energy efficiency of the glider's propulsion mechanism permits endurance of weeks to months, the forward speed of the vehicles is fairly limited (0.25-0.30 m/s), which can create significant challenges for navigation in strong currents. Use of a thruster in a so-called "hybrid" glider configuration can increase forward speed to approximately 0.50 m/s, but at great energetic cost. The continental shelf near Cape Hatteras is strongly influenced by the presence of the Gulf Stream, which periodically intrudes onto the shelf, resulting in strong and spatially variable flow that can be nearly an order of magnitude greater than the forward speed of the vehicle (2+ m/s). With realistic estimates of the spatial and temporal variability of the flow, path planning can provide a significant advantage for successful sampling.
We deployed one glider off Oregon Inlet, NC May 16, 2017 and recovered it 14 days later. For its mission, the glider was initially tasked to sample along a path with offshore, inshore, FIGURE 2 | Illustration of computing PT(n 1 ) and PT(n 2 ). The red triangle is S 1 {(x, y) h 1 y + g 1 x < C, x ∈ [1, H max , y ∈ [1, H max ]]}, and the green triangle is Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 and triangular segments designed to sample critical flow features (Figure 3), and was not used with a thruster. The glider surfaced approximately every 4 h to update its position with a GPS fix, communicate with shore, transmit a subset of data, and most importantly, receive mission updates and commands to adapt sampling.

Flow Modeling Using Glider Experimental Data
In this example, we present flow modeling results using the proposed flow partition and parameter estimation methods. The flow map forecast is given by a 1-km horizontal resolution version of the Navy Coastal Ocean Model  Frontiers in Robotics and AI | www.frontiersin.org July 2021 | Volume 8 | Article 575267 12 (NCOM, Martin, 2000), made available by J. Book and J. Osborne (Naval Research Laboratory, Stennis Space Center, United States). In the domain of interest, the ocean model flow forecast is given at 106 × 106 rectangular grid points. Tidal flow accounts for much of the short-term ( < 24 hour) temporal variation of the flow field. Hence the partition time interval is taken over multiple periods of the largest tidal constituent, the lunar semidiurnal M 2 tide (period 12.42 h). Maximum flow speed in this area is 2.2788 m/s, approximately 7.5 times the vehicle speed, and 4.5 times the speed of a hybrid glider using a thruster. We set the upper bound for flow partition error to be 0.35 m/s, which is about 15% of the maximum flow speed in the domain. Figure 4 describes the flow partition error in the case of different selection of cell number. Since the flow partition error goes below the upper bound when k 13, the number of cells is chosen as 13. We smooth the cell boundaries into straight lines using the Least Mean Square method. Even though smoothing the cell boundaries might overlook more detailed spatial variability of the flow field, it helps to reduce the computational cost of solving the planning problem, specifically, in solving eq. 17 and eq. 20. The partitioned flow field is shown in Figure 5. Comparing Figures 3, 5, it can be seen that the proposed algorithm captures the major spatial variation of the flow field, by separating the high speed flow regions from the area where the flow is at lower speed.
At each surfacing, the vehicle position is given by the GPS location. When the glider is underwater, we use linear interpolation to estimate the heading and vehicle position. The vehicle's forward speed is zero when it is at the surface of the ocean, and the vehicle drifts freely with the surface current. This violates the constant forward speed assumption stated in Assumption 2.1. Hence, we remove the segment of data when the vehicle is drifting at surface, and then compute the estimated flow parameters by the proposed algorithm. Glider speed is initialized to be 0.3 m/s, while the flow parameters are initialized by the flow vectors found by partitioning the NCOM data. Since the vehicle trajectory crosses cell 3, 6, 7, 10, and does not enter other cells, the glider trajectory does not satisfy persistent excitation condition described in Lemma 5.3. Hence only the flow parameters in cells 3, 6, 7, 10 can be updated by the adaptive updating law, while the flow parameters in the rest cells remain to be the initial value. To justify performance of the proposed flow parameter estimation algorithm, we use the ADCIRC (Advanced Circulation) model output (Luettich et al., 1992) to model the tidal flow component, and derive the non-tidal glider estimated flow speed by subtracting ADCIRC reported flow from the flow parameter estimate. The de-tided glider estimated flow speed is considered as the ground truth of flow parameters in the corresponding cells. The root mean square error (rmse) between the estimated parameter and the ground truth in cell 3, 6, 7, 10 is shown in Table 1. It can be seen that in all of the four cells, the estimated flow parameters is in good agreement with the true flow parameters. The rmse in all of the four cells is within 5% of the maximum flow speed in the domain. Figure 6 shows the comparison between the estimated flow parameters and the true flow parameter value in one of the cells that the glider trajectory visits. It is shown that in cell 7, the estimated flow parameter matches well with the true value.

Bounded Cost Path Planning
In this example, we present simulation results of AUV bounded cost path planning. Since the flow field is of high speed in the domain of interest, we assume that the glider is sampling the domain using combined propulsion of buoyancy and thrusters for the operation. Hence the AUV through-water speed is set to be 0.5 m/s. The simulations are run on a core i7 at 1.80 GHz PC with 32GB RAM. The travel cost of the resulting path is 62.650 h, which satisfies the bounded cost constraint. As shown in the figure, the generated path makes a detour and takes advantage of the strong northward ocean flow to travel to the goal position.
We perform A* (Carroll et al., 1992) and Level Set Method (LSM) (Lolla et al., 2014) as comparison to the proposed method. 15 test cases are generated in total. Each test case T {Start, Goal, d} is built by first assign the distance between the start and the goal node d to be 20 km, 50 km, 80, or 100 km, then randomly place the Start point in the domain, and select the Goal node so that the distance to the start node is d. The computation time column in Table 2 shows comparison of the averaged computational time for A*, LSM, and the proposed algorithm. Table 3 presents the post-hoc analysis results of the simulation. The post-hoc analysis rejects null hypotheses of the same performance, i.e. the proposed algorithm spends less computation time to solve the planning problem than the A* and the LSM method, for all different scenarios of d. The difference between the three algorithms is due to the number of nodes in the graph. By partitioning the domain into 13 cells, the proposed algorithm searches for the optimal path in a graph with only 13 × 12 nodes, while both the A* and LSM algorithm searches for the optimal path in a domain containing 106 × 106 nodes. Thus, the computational cost of the proposed algorithm is significantly lower than the other two methods.
Further, we compare the percentage of increase in the computation time when d increases. In Table 2, the % of increase column shows the increase in the computation cost when d increases from 20 to 50, 80, and 100, respectively. The percentage is calculated by considering the computation time of each algorithm when d 20 km as the base time, and divide the increase of computation time when d scales up by the base time. It can be seen that when the domain of interest scales up, the   computational cost of the proposed algorithm has the least increase, compare with A* and LSM. This is because as d increases, both A* and LSM have to expand significantly more nodes before finding the optimal solution. For the proposed algorithm, the number of nodes to be expanded stays relatively constant as d increases. Hence, its computation cost does not increase as much as A* and LSM as the domain scales up. It is worth mentioning that the proposed algorithm achieves decreased computation cost by compromising the path quality. Even though optimality of the planned path is guaranteed in the partitioned flow field, as shown in Theorem 5.8, the planned path may not be optimal in the actual flow field, since the partitioned flow field is different from the actual flow field. We identify the compromised path quality as the major constraint of the proposed algorithm.
In cases where the domain is larger, Algorithm 1 may still result in large number of cells, leading to increased computation cost in solving the bounded cost planning problem. In such scenarios, stochastic optimization methods, such as the differential evolution method, may be helpful in further reducing the computation cost of solving the planning problem. We refer to a survey paper on the differential evolution methods (Das et al., 2016) for this matter.

CONCLUSION
In this paper, a bounded cost path planning method is developed for underwater vehicles assisted by a data driven flow field modeling method. The main advantage of the proposed modified PTS method is that it is more computational efficient comparing to A* and LSM in solving AUV planning problem in time-invariant 2D fields, as demonstrated by the simulation result. Major limitation of the proposed algorithm is the compromised solution quality, resulting from the model reduction error introduced by the flow partition process. The proposed method has the potential to be extended to other path planning applications where the task performance is sensitive to planner's computational efficiency. Future work will include performing the proposed method in real glider deployments, to compare the planned trajectory with the real mission trajectory, where drift and time-varying fields happen. Future work will also include comparing the proposed method with other algorithms, such as the differential evolution algorithms.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: https://www.ncdc.noaa.gov/data-access/ model-data/model-datasets/navoceano-ncom-reg. The flow map data analyzed in this work is given by the Navy Coastal Ocean Model made available by J. Book and J. Osborne of Naval Research Laboratory, Stennis Space Center, US.

AUTHOR CONTRIBUTIONS
MH, SC, and FZ contributed to design and theoretical analysis of the methods, and wrote the first draft of the manuscript. HZ, CE and FZ wrote sections of the manuscript and provided guidance for the research work. All authors contributed to the manuscript revision, read, and approved the submitted version.