Artificial intelligence in the autonomous navigation of endovascular interventions: a systematic review

Background: Autonomous navigation of catheters and guidewires in endovascular interventional surgery can decrease operation times, improve decision-making during surgery, and reduce operator radiation exposure while increasing access to treatment.
Objective: To determine from recent literature, through a systematic review, the impact, challenges, and opportunities artificial intelligence (AI) has for the autonomous navigation of catheters and guidewires for endovascular interventions.
Methods: PubMed and IEEEXplore databases were searched to identify reports of AI applied to autonomous navigation methods in endovascular interventional surgery. Eligibility criteria included studies investigating the use of AI in enabling the autonomous navigation of catheters/guidewires in endovascular interventions. Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), articles were assessed using Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2). PROSPERO: CRD42023392259.
Results: Four hundred and sixty-two studies fulfilled the search criteria, of which 14 studies were included for analysis. Reinforcement learning (RL) (9/14, 64%) and learning from expert demonstration (7/14, 50%) were used as data-driven models for autonomous navigation. These studies evaluated models on physical phantoms (10/14, 71%) and in-silico (4/14, 29%) models. Experiments within or around the blood vessels of the heart were reported by the majority of studies (10/14, 71%), while non-anatomical vessel platforms “idealized” for simple navigation were used in three studies (3/14, 21%), and the porcine liver venous system in one study. We observed that risk of bias and poor generalizability were present across studies. No procedures were performed on patients in any of the studies reviewed. Moreover, all studies were limited due to the lack of patient selection criteria, reference standards, and reproducibility, which resulted in a low level of evidence for clinical translation.
Conclusion: Despite the potential benefits of AI applied to autonomous navigation of endovascular interventions, the field is in an experimental proof-of-concept stage, with a technology readiness level of 3. We highlight that reference standards with well-identified performance metrics are crucial to allow for comparisons of data-driven algorithms proposed in the years to come. Systematic review registration identifier: CRD42023392259.


Introduction
Cardiovascular (CV) diseases are the most common cause of death across Europe, accounting for more than 4 million deaths each year, with coronary heart disease (44.2%) and cerebrovascular disease (25.4%) emerging as the predominant contributors to CV-related mortality across all ages and genders [1]. Endovascular catheter-based interventions such as percutaneous coronary intervention (PCI), pulmonary vein isolation (PVI) and mechanical thrombectomy (MT) have become an established treatment for CV diseases [2,3,4,5]. During such a procedure, an operator navigates a guidewire and catheter from an insertion point (typically the common femoral or radial artery) to the area of interest to perform the intervention. Intraoperative fluoroscopy is used intermittently throughout the navigation and intervention to guide the catheter and guidewire through the vasculature. Once the target site has been reached, the treatment can be performed through the catheter. This is typically thrombus removal in the case of MT, stent deployment in the case of PCI, and ablation for PVI [6].
In acute CV disease, time from symptom onset to treatment is often crucial for effective endovascular interventions. For example, the benefits of MT become non-significant 7.3 hours after stroke onset for non-stratified patients [7]. As a result, in the UK for example, only 1.4% of stroke admissions benefit from MT although 10% of patients are eligible for treatment [8]. Other challenges for endovascular interventions relate to occasional complications including perforation, thrombosis and dissection in the parent artery, as well as distal embolization of thrombus [9]. Moreover, angiography requires intravascular contrast agent administration, which can occasionally lead to nephrotoxicity [10]. For operators and their teams, the high cumulative dose of x-ray radiation from angiography is a risk factor for cancer and cataracts [11]. Although exposure can be minimised with current radiation protection practice, some measures involve operators wearing heavy protective equipment, which is a risk factor for orthopaedic complications, and so alternative methods of exposure reduction are beneficial [12,13].
It is hoped that robotic surgical systems can either mitigate or eliminate some of the challenges currently presented by endovascular interventions. For example, robotic systems could be set up in hospitals nationwide and tele-operated remotely from a central location, increasing the speed of access to treatments such as MT beyond what is possible currently [14]. Additionally, robotic systems might eliminate any operator physiological tremors or fatigue and allow endovascular interventions to be performed in an optimum ergonomic position while potentially increasing procedural precision (for example, procedure time), and thereby improving overall performance scores and reducing complication rates [15]. Furthermore, as operators would not be required to stand next to the patient, their radiation exposure would be reduced and the need to wear heavy protective equipment would be obviated.
Commercial robotic systems are currently available to perform endovascular interventions. Hansen Medical developed the Magellan™ system (Auris Health, Redwood City, USA), the first commercially available robotic system to be used for PVI, and more recently used to successfully perform carotid artery stenting in 13 patients [16,17]. This system comprises a steerable guide catheter inside a steerable sheath allowing movement in three dimensions, and a separate remote guidewire manipulator allowing linear and rotational movement. The Corpath GRX® (Corindus Vascular Robotics, USA), the next-generation system of the Corpath® 200 robot, has successfully been used for PCI and PVI. This system has performed diagnostic cerebral angiography procedures and ten carotid artery stenting procedures [18,19,20]. Furthermore, it has been recently used to perform robot-assisted neuroendovascular interventions including aneurysm embolisation and epistaxis embolisation [21,22,23]. These systems use a controller-operator structure, where operators remotely control and navigate a robot through a patient's vasculature to the target site. In currently available systems, the operator has complete control over the robot and makes all of the decisions.
While these robotic systems help alleviate some of the challenges of endovascular interventions, they have limitations. The controller-operator structure imposes a reasonably high cognitive workload, can still result in human error, and means that the procedure is limited to an individual operator's skill set [24]. These robotic systems also rely on user interfaces such as buttons and joysticks, requiring skills that are different to those used in current clinical practice. Additionally, a lack of haptic feedback from robotic systems might make it difficult for operators to perceive tactile feedback from the catheters and guidewires as they interact with vessel walls [14].
One emerging method of mitigating these challenges is using artificial intelligence (AI) techniques in conjunction with robotic systems. AI, and in particular, machine learning (ML), has accelerated in recent years in its applications for data analysis and learning [25], with many areas of healthcare already making use of this technology for disease prediction and diagnosis [26,27]. ML algorithms can be divided into three main groups: supervised, unsupervised, and reinforcement learning (RL). Supervised learning is the most common form of ML and involves constructing a model trained on a dataset with labels (the corresponding correct outputs). The model can then accurately predict the labels of new, unknown instances based on the patterns learned from the training data [28].
Unsupervised learning involves training an algorithm to represent particular input features in a way that reflects the structure of the overall collection of input patterns [29]. In contrast to other types of ML, the dataset is unlabelled and there are no explicit target outputs or environmental evaluations associated with each input.
RL is a form of ML, whereby an agent learns by interacting with the environment and receiving feedback in the form of rewards. The goal of RL is to maximise the cumulative reward over time by learning a policy that selects the best action for the agent's current state [30]. Similar to the natural way of human learning, robotic RL automatically acquires skills through 'trial and error' [31]. Applications of RL are becoming more expansive, as numerous research areas aim to use the method, for example, in precision medicine, medical imaging, and rehabilitation [32,33,34].
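The agent, reward, and policy described above can be made concrete with a minimal sketch of tabular Q-learning. Everything in it is an illustrative assumption rather than a method from the reviewed studies: the environment is a hypothetical one-dimensional "vessel" of six positions, the two actions (retract, advance) and the reward values are invented, and the hyperparameters are arbitrary.

```python
import random

def train_q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D 'vessel' (hypothetical environment).

    States are positions 0..n_states-1; the target is the last position.
    Action 0 retracts by one position, action 1 advances by one position.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    target = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy: explore occasionally, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max(range(2), key=lambda i: q[state][i])
            next_state = min(max(state + moves[action], 0), target)
            # Assumed reward: +1 on reaching the target, small cost per step.
            reward = 1.0 if next_state == target else -0.01
            # No future value is bootstrapped from the terminal state.
            future = 0.0 if next_state == target else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if state == target:
                break
    return q

q_table = train_q_learning()
# Greedy action for every non-terminal position after training.
greedy_policy = [max(range(2), key=lambda i: q_table[s][i])
                 for s in range(len(q_table) - 1)]
```

In this toy setting the learned greedy policy is to advance at every position. Real endovascular navigation involves continuous, high-dimensional states, which is why the reviewed studies use deep RL variants rather than a lookup table.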
Learning from demonstration (LfD) is a variant of supervised learning, where input data is provided by an expert demonstrator. This can also act as a precursor for RL, whereby the agent can further improve its behaviour through interaction with the environment. Table 1 describes the ML methods that are referred to later in this paper, each of which can be used to improve performance across the three types of ML described above. LfD has been separated from the other types of ML in this case, as it can be used in the context of both supervised learning and RL.

Method | Type of ML | Description
CNN | Supervised learning | Type of deep neural network specifically designed for image processing and pattern recognition tasks. CNNs leverage spatial hierarchies through convolutional layers that extract local features and preserve spatial relationships, enabling effective image classification, object detection, and image segmentation tasks [37].
DDPG | RL | An algorithm that merges RL and policy optimisation. It iteratively refines the policy based on estimated value distributions, to find an optimal strategy [38].
DQN | RL | Leverages a deep neural network to learn optimal policies through Q-learning (see Q-learning explanation below). It enables agents to make decisions by maximising the expected cumulative rewards, facilitating dynamic environment interaction [39].
Dueling DQN | RL | An extension of DQN that separates the estimation of state value and action advantages. By independently approximating these values, the agent can learn the value of being in a particular state while also considering the advantages of each action [40].
GAIL | LfD | Method where an agent learns a policy by imitating expert behaviour using a generative adversarial framework. It involves a generator network that aims to replicate the expert and a discriminator network that distinguishes between expert and generated behaviour [41].
GMM | Unsupervised learning | A statistical model that assumes data is generated by a mixture of several Gaussian distributions [42].
HD | LfD | Term that encompasses the process of an expert performing a task. Human demonstration can be used as a means to collect data for LfD [43].
HER | RL | Allows an agent to learn from "failed" experiences by redefining the goal of a task [44].
HMM | Unsupervised learning | A statistical model that assumes observations are generated by a hidden sequence of states that follow a Markov process [45].
PI2 | RL | Optimisation algorithm which aims to find the optimal policy by iteratively improving the policy through gradient-based optimisation methods, maximising the expected return [46].
PPO | RL | An algorithm that optimises policies iteratively while ensuring small policy updates. It balances exploration and exploitation, enhancing stability and sample efficiency during training [47].
Q-learning | RL | Algorithm that learns the optimal action-value function (Q-value function) for sequential decision-making. It updates Q-values iteratively based on observed rewards and the maximum expected future rewards [48].
Rainbow | RL | Extension of DQN that combines multiple improvements to enhance performance, by incorporating techniques such as prioritised experience replay, distributional value estimation, and multi-step learning to improve overall learning stability and efficiency [49].
YOLO | Supervised learning | Object detection algorithm that can detect and classify objects in real-time. It uses a single neural network to directly predict bounding boxes and class probabilities for objects in an image, providing fast and accurate object detection [50].

A3C, asynchronous advantage actor critic; CNN, convolutional neural network; DDPG, deep deterministic policy gradient; DQN, deep Q-network; GAIL, generative adversarial imitation learning; GMM, Gaussian mixture modelling; HD, human demonstration; HER, hindsight experience replay; HMM, hidden Markov models; LfD, learning from demonstration; PI2, policy improvement with path integrals; PPO, proximal policy optimisation; RL, reinforcement learning; YOLO, you only look once.
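For readers who prefer equations, two of the descriptions in Table 1 can be written compactly. These are the standard forms from the general RL literature, given here as a reference sketch; the symbols are assumptions not defined in this review: $\alpha$ is a learning rate, $\gamma$ a discount factor, $r_{t+1}$ the reward received after taking action $a_t$ in state $s_t$, $V$ the state value, $A$ the action advantage, and $\mathcal{A}$ the set of available actions.

```latex
% Q-learning: move the current estimate towards the observed reward plus the
% discounted maximum expected future value.
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

% Dueling DQN: combine separately estimated state value and action advantages.
Q(s, a) = V(s) + A(s, a) - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a')
```

Subtracting the mean advantage in the second expression is the usual way of making the split between $V$ and $A$ identifiable; individual implementations may differ.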
The use of these ML techniques for autonomy in medical robotics presents several challenges. To help in the consideration of regulatory, ethical, and legal barriers imposed, a six-level autonomy framework has been proposed, ranging from no autonomy at level 0, up to level 5 which involves full autonomy with no human intervention [51]. This study aims to systematically review the methodology, performance and autonomy level of AI applied to the autonomous navigation of catheters and guidewires for endovascular interventions. Understanding the current developments in the field will help determine the impact, challenges, and opportunities required to direct future translational research and ultimately guide clinical practice.

Methods
This systematic review is registered with PROSPERO (International Prospective Register of Systematic Reviews; CRD42023392259). The review followed Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [52].

Eligibility criteria
Included reports consisted of primary research studies, which investigated the use of AI in enabling the autonomous navigation of catheters and/or guidewires in endovascular interventions. Excluded studies did not use AI methods to achieve autonomous navigation of catheters/guidewires or looked at path planning for endovascular interventions rather than the navigation itself. Additionally, studies without an English translation were not included [53].

Information sources and search strategy
PubMed and IEEEXplore were used to capture original research articles, published anytime until the end of January 2023, with the following search query: "(Artificial Intelligence OR Machine Learning OR Reinforcement Learning OR Deep Learning OR Autonomous OR Learning-based) AND (Endovascular OR Vascular Intervention OR Catheter OR Guidewire) AND (Navigation OR Guidance)". Pre-prints and non-peer-reviewed articles were excluded.
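The Boolean structure of the search query (terms OR-ed within a concept group, groups AND-ed together) can be illustrated with a short sketch. The term lists are copied from the query above; the function name and the case-insensitive substring matching are assumptions made for illustration. The databases apply their own field mapping and term handling, so this does not reproduce their behaviour.

```python
# Concept groups copied from the review's search query.
# Terms within a group are OR-ed; the three groups are AND-ed.
QUERY_GROUPS = [
    ["artificial intelligence", "machine learning", "reinforcement learning",
     "deep learning", "autonomous", "learning-based"],
    ["endovascular", "vascular intervention", "catheter", "guidewire"],
    ["navigation", "guidance"],
]

def matches_query(text: str) -> bool:
    """Return True if every concept group has at least one term in the text.

    Simplification: plain case-insensitive substring matching.
    """
    lowered = text.lower()
    return all(any(term in lowered for term in group) for group in QUERY_GROUPS)
```

For example, a title such as "Deep reinforcement learning for guidewire navigation" satisfies all three groups, whereas "Catheter ablation outcomes" satisfies only the second and would not be returned.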

Selection and data collection process
A medical robotics data scientist, H.R. (3 years of research experience), searched for studies as defined in the search strategy and followed the selection process as shown in Figure 1. A medical robotics data scientist, L.K. (4 years of experience in autonomous endovascular navigation using AI), independently reviewed the manuscripts against the eligibility criteria. In the case of discrepancy, consensus was reached by discussion between the two reviewers. If consensus was not reached, the multi-disciplinary authorship would make the final arbitration. The relevant data items, as defined in the following section, were extracted.

Data items, effect measures and synthesis methods
Information extracted from each study included: the AI method used and more granular model details (where available), the current level of autonomy, the type of experiment (in vivo, in vitro, in silico), the method of tracking the catheter and/or guidewire position, the method of catheter and/or guidewire manipulation, description of the navigation path, performance measures, and key performance outcomes (where available). The levels of autonomy followed the framework in [51]. Briefly, these are level 0: no autonomy, level 1: robot assistance, level 2: task autonomy, level 3: conditional autonomy, level 4: high autonomy, and level 5: full autonomy. It should be noted that if the autonomy level was not described in the study, an appropriate level was assigned based on the content of the paper.
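When extracting the autonomy level as a data item, the six levels listed above can be encoded as an ordered enumeration so that levels can be compared and tabulated consistently. This is a minimal sketch; the class and function names are hypothetical, and only the level numbers and labels come from the text.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """Six-level autonomy framework as summarised in the text (levels 0 to 5)."""
    NO_AUTONOMY = 0
    ROBOT_ASSISTANCE = 1
    TASK_AUTONOMY = 2
    CONDITIONAL_AUTONOMY = 3
    HIGH_AUTONOMY = 4
    FULL_AUTONOMY = 5

def describe(level: AutonomyLevel) -> str:
    """Render a level in the 'level N: label' form used in the text."""
    return f"level {level.value}: {level.name.replace('_', ' ').lower()}"
```

Because `IntEnum` members are ordered, a check such as `AutonomyLevel.TASK_AUTONOMY < AutonomyLevel.HIGH_AUTONOMY` behaves as expected when grouping studies by level.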

Study risk of bias, reporting bias and certainty assessment
Where appropriate, both the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) methodology and AI metrics from the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) were used to assess the risk of bias for each study [54,55].

Studies
As shown in Figure 1, 462 studies met the search criteria, and 21 full-text studies were assessed against the eligibility criteria. A total of 14 were identified for review [56,57,58,59,60,61,62,63,64,65,66,67,68,69]. The characteristics of the fourteen studies are listed in Table 2. According to QUADAS-2 methodology, all studies reviewed gave a high or unclear 'risk of bias' and 'concerns regarding applicability' in all domains. No studies performed procedures on patients and therefore had no clearly defined patient selection criteria, reference standards, or index tests. Despite the low level of evidence, there is value in discussing these individual studies as they represent the current state of the art and form a baseline for further research.
In silico methods were used in two of the studies (2/14, 14%): one used the SOFA framework and one used the Unity engine [63,62]. Ex vivo experiments using porcine liver vasculature were reported by one study [60]. Here, in silico methods were used for training models before the ex vivo experiments.

Evaluation
Passive tracking relies on external sensors to detect the catheter's position, active tracking involves the use of sensors located at the distal end of the catheter for real-time position tracking, and magnetic tracking utilizes external magnetic fields to guide the catheter's movement and track its position. A passive, tracking-based, method for catheter manipulation was used in eight studies (8/14, 57%) [56,61,66,69,63,62,57,59], whereas a passive, image-based, method for catheter manipulation was used in the other six studies (6/14, 43%) [67,65,64,60,58,68]. None of the studies reviewed reported active or magnetic steering methods.
A top-down camera for tracking the location of the guidewire and/or catheter was implemented in five of the studies (5/14, 36%), where transparent phantoms allowed real-time video to provide software-generated tracking data [67,65,64,58,68]. Electromagnetic (EM) position sensors were employed in six studies (6/14, 43%) [61,66,69,57,59,56]. An Aurora control unit and EM generator from the Aurora electromagnetic tracking system (NDI, Waterloo, Canada) were used in one of these studies [66], whilst custom-designed sensors [57] were used in the other five. These five studies also employed a top-down camera simultaneously, enabled through the use of transparent phantoms, during data collection pre-training. One study employed continuous fluoroscopy, capturing 7.5 images per second, and used a CNN to segment the guidewire from real-time fluoroscopy images to track data that included the coordinates [60]. Two studies (2/14, 14%) were performed entirely in silico, and hence no tracking method was required [63,62].
Quantitative performance measures used in the studies were heterogeneous, which may reflect the low technology readiness level (TRL) [71] of AI applied to autonomous navigation of endovascular interventions shown by the studies in this systematic review. Common performance measures used were success rate of navigation task (7/14, 50%) and time to complete procedure (5/14, 36%). Other performance measures shared across studies were: measures of force (6/14, 43%); acceleration (4/14, 29%); various measures of speed (4/14, 29%); and path length (4/14, 29%). Half of the studies (7/14, 50%) reviewed compared manual performance against their autonomous navigation performance. The key performance outcomes of the fourteen studies are listed in Table 3.
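The proportions quoted above follow directly from the study counts, with percentages rounded to the nearest integer. The short sketch below recomputes them; the helper name and the dictionary layout are assumptions, while the counts are those reported in the text. Because a single study can report several measures, the counts are not expected to sum to 14.

```python
def proportion(count: int, total: int = 14) -> str:
    """Format a study count as 'count/total, pct%' with an integer percentage."""
    pct = round(100 * count / total)
    return f"{count}/{total}, {pct}%"

# Number of studies reporting each performance measure, as stated in the text.
reported_measures = {
    "success rate": 7,
    "time to complete procedure": 5,
    "force": 6,
    "acceleration": 4,
    "speed": 4,
    "path length": 4,
}

summary = {name: proportion(count) for name, count in reported_measures.items()}
```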

Summary of findings
There is no high-level evidence [72] to demonstrate that AI autonomous navigation of catheters and guidewires in endovascular intervention is non-inferior or superior to manual procedures. Currently, AI autonomous navigation of catheters and guidewires in endovascular intervention has not surpassed TRL 3. There has been no clinical validation nor has there been comprehensive laboratory validation. Over half of the studies (9/14, 64%) employed RL methodologies, particularly in recent years, where most studies published beyond 2018 used RL (8/10, 80%). There are no standardised in silico, in vitro or ex vivo experimental reference standard designs, nor are there standardised performance measures, meaning comparison of studies quantitatively is of limited value.

Strengths
The primary strength of the studies reviewed came from the range of ML techniques employed. Most focused on finding a ML technique that would improve upon previous work, rather than using similar algorithms and extending the experimental environment. This is demonstrated well within the nine studies (9/14, 64%) which used RL, where a different ML-based methodology was used in every case except for two (where the simulation environment and output measurements were changed between studies). Exploring various techniques is advantageous for research, especially in the rapidly evolving field of ML, as the fast pace of development increases the likelihood that more effective algorithms are created. For example, autonomous endovascular intervention progress has been catalysed by combining two recent approaches (LfD and RL) [61,69,65,64]. Here, using demonstrator data in a third of the RL studies allowed expert operator skill in complex endovascular procedures to be incorporated. This proficiency can be leveraged effectively to accelerate the RL training process. The combined approach, therefore, shortens the transition from a simulated training environment to a physical testing environment, which typically presents significant challenges, as evidenced by the findings of [60]. Another benefit of accelerating the process is that in some scenarios thousands of mechanical experimental training cycles may no longer be required, leading to reduced mechanical wear on the experimental equipment.

Limitations
The limitations of the studies assessed encompassed three areas. Whilst it was a strength that most studies focused on finding a ML technique that would improve upon previous endovascular navigation, the lack of focus on using similar or fixed algorithms and extending the experimental environment was a limitation. The challenge of fixing many experimental variables whilst changing another is compounded by the lack of standardised in silico, in vitro or ex vivo experimental reference standard designs for endovascular navigation, as well as a lack of standardised performance measures. As such, the ability to compare studies quantitatively was limited by confounding. For example, although some performance measures (e.g., 'success rate' and 'procedure time') were common to several studies, study comparison was limited due to experimental variations between studies. Firstly, the navigation path used to test the models varied. Secondly, some studies defined 'success rate' only if a task was completed within a certain time frame, whereas others had no time limit for completion. Thirdly, 'procedure time' was measured using different starting points and target sites.
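The effect of the second variation, whether 'success rate' includes a time limit, can be shown with a small worked sketch. The trial data are made up purely for illustration and are not taken from any reviewed study; the function name, the tuple layout, and the 100-second limit are likewise assumptions.

```python
from typing import Optional, Sequence, Tuple

Trial = Tuple[bool, float]  # (target reached, time taken in seconds)

def success_rate(trials: Sequence[Trial],
                 time_limit_s: Optional[float] = None) -> float:
    """Fraction of trials that reached the target.

    If time_limit_s is given, a trial only counts as a success when the
    target was reached within that limit.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    successes = sum(
        1 for reached, seconds in trials
        if reached and (time_limit_s is None or seconds <= time_limit_s)
    )
    return successes / len(trials)

# Made-up trial data for illustration only.
trials = [(True, 40.0), (True, 95.0), (True, 130.0), (False, 200.0)]
unlimited = success_rate(trials)
limited = success_rate(trials, time_limit_s=100.0)
```

The same four trials give a success rate of 0.75 without a time limit and 0.5 with the assumed 100-second limit, which is why the definition needs to be reported alongside the figure before studies can be compared.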
Another limitation, also concerned with reference standards, relates to the comparison of endovascular navigation with an autonomous system against endovascular navigation without one, which is needed to determine any incremental benefit through autonomy. Critically, the endovascular navigation without an autonomous system should ideally be operated by a relevant expert operating with minimal technical constraint to derive the reference standard (baseline) allowing comparison. Half the studies (7/14, 50%) reviewed did compare endovascular navigation with and without an autonomous system; however, in some cases, the operator was technically constrained by using a novel robotic system rather than the equipment and processes they would typically employ during an endovascular procedure in the clinic. For example, the robotic systems used in the studies reviewed failed to mimic the haptic feedback that the operator would receive performing procedures manually, such as viscous forces between the catheters and the blood, friction forces between the catheters and the vessel wall, and impact forces between the tips of the catheter and guidewire and the vessel wall [14]. Additionally, an expert is not able to use their previous experience with standard equipment and may be unfamiliar with these controls, meaning that performance at a given task will likely be affected.
There were no clinical studies of autonomous endovascular navigation, which is a reflection of the nascent field and current TRL of the technology. The majority of studies (11/14, 79%) were in vitro and are valuable for development and testing as they limit the number of failures during subsequent in vivo testing [73]. However, these studies did not consider whether construct, face, and context validity of endovascular navigation systems was acceptable to allow TRL progression towards the clinic. In particular, in many of the studies reviewed, there were translational concerns regarding how the guidewires and/or catheters are tracked within the vasculature, as the alternative to using fluoroscopy with standard off-the-shelf catheters and guidewires is to create entirely new tracking methods. For example, several papers (6/14, 43%) used EM-tracking to visualise the catheter in real-time, which has been shown to allow better real-time 3D orientation, facilitating navigation, reducing cannulation and total fluoroscopy times, and improving motion consistency and efficiency [74]. However, clinical translation using this method would require the introduction of new systems with specialised catheters and guidewires, resulting in additional costs and training. Furthermore, other studies (5/14, 36%) employed an experimental set-up involving a tabletop with a transparent phantom and a top-down camera. In its current state, this tracking method would not be suitable for future clinical studies, as a top-down camera would not be able to provide images of the guidewire and/or catheter through patient tissue. Nonetheless, it is noted that top-down cameras have a narrower clinical translation gap than EM-tracking, as they pose the same 2D challenges as fluoroscopy.

Final thoughts and future research
Using AI, it may be possible to create a robotic system capable of autonomously navigating catheters and wires through a patient's vasculature to the target site, requiring minimal assistance from an operator. If proven to be safe and effective in clinical trials, the benefits of autonomous navigation are numerous. It is plausible that in clinical specialities facing a shortage of highly-trained operators, there may be a reduced need for their expertise, potentially leading to greater accessibility of endovascular treatments globally, such as MT. For example, components of MT such as complex navigation tasks could be performed autonomously. Furthermore, autonomous systems are not limited by human factors such as fatigue or loss of focus, potentially making procedures safer and quicker [75].
The concept of fully autonomous navigation in endovascular interventions is promising; however, with a TRL of 3 [71], the technology is yet to complete validation even in a laboratory environment. Due to the inadequate evidence supporting its use (the limited number of studies and their low level of evidence) [72], it is far from being used in clinical practice. It first must be demonstrated that it can reliably provide benefits over currently available treatments before it can progress towards clinical trials.
Importantly, reference standards for endovascular navigation models need to be established to allow new models to be compared. This would allow effective comparison of different AI methods to determine the most effective model for autonomous endovascular navigation. These reference standards need to be established judiciously at the in silico, in vitro, and ex vivo level with carefully-defined environments for different endovascular tasks such as PCI, PVI, and MT. It is noteworthy that at the in silico level, where there are continuous advancements in modelling research and increased computational power, other areas of clinically-orientated ML research have successfully employed reference standards to enable reproducibility of results and comparability between competing models [76,77]. This includes computer vision (ImageNet Large Scale Visual Recognition Challenge) and natural language processing (National NLP Clinical Challenges). Furthermore, a set of minimum reporting standards of performance should be defined for studies investigating the use of AI in the autonomous navigation of endovascular interventions. In combination with a reference standard, this would allow complete comparison between ML algorithms designed for this specific task.
Clear regulation is required to determine how the community designs systems for autonomous navigation in endovascular interventions. In the seven studies (7/14, 50%) which proposed a system with 'level 3' autonomy, there is an expert operator in place who can intervene in the autonomous task if needed ('human in the loop'). It may be prudent, for now, for researchers to focus on optimising systems with 'level 1-3' autonomy, as at higher levels of autonomy, particularly 'level 5' and potentially 'level 4', it is unclear how systems in which the robot can make decisions will be regulated. As such, future researchers may wish to optimise simple task autonomy, for example the autonomous navigation from the puncture point to the target site, in a system where an operator can stop the procedure and take over at any time. It is envisaged that as autonomous technology and regulations mature over time, systems will then be updated to carry out more difficult tasks.
Various AI methods have been used to investigate the possibility of autonomous navigation in endovascular interventions. Although it is plausible that autonomous navigation may eventually benefit patients while reducing occupational hazards for staff, there is currently no high-level evidence to support this assertion. For the technology to progress, reference standards and minimum reporting standards need to be established to allow meaningful comparisons of new system development.

Figure 1: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram showing the number of articles searched and excluded at each stage of the literature search after screening titles, abstracts, and full texts.

Figure 2: Diagram depicting the general vessels of interest for each study. *Study is in more than one area. Studies using non-anatomical platforms are also shown.

Table 1: Description of ML methods.

Table 2: Studies resulting from our search and eligibility criteria proposing AI models for the autonomous navigation of catheters/guidewires in endovascular interventions. LfD was used as a ML method in cases where no further information about the type of LfD was available. Descriptions of each type of ML method can be found in Table 1.