Edited by: Kathryn Hess, École Polytechnique Fédérale de Lausanne, Switzerland
Reviewed by: Raphael Reinauer, École Polytechnique Fédérale de Lausanne, Switzerland; Matteo Caorsi, L2F SA, Switzerland
This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
The last decade saw an enormous boost in the field of computational topology: methods and concepts from algebraic and differential topology, formerly confined to the realm of pure mathematics, have demonstrated their utility in numerous areas such as computational biology, personalised medicine, and time-dependent data analysis, to name a few. The newly-emerging domain comprising topology-based techniques is often referred to as topological data analysis (TDA). Next to their applications in the aforementioned areas, TDA methods have also proven to be effective in supporting, enhancing, and augmenting both classical machine learning and deep learning models. In this paper, we review the state of the art of a nascent field we refer to as “topological machine learning,” i.e., the successful symbiosis of topology-based methods and machine learning algorithms, such as deep neural networks. We identify common threads, current applications, and future challenges.
Topological machine learning recently started to emerge as a field at the interface of topological data analysis (TDA) and machine learning. It is driven by improvements of computational methods, which make the calculation of topological features (via persistent homology, for instance) increasingly flexible and scalable to more complex and larger data sets.
Topology is often colloquially described as encoding the overall shape of data. Hence, as a complement to localised and generally more rigid geometric features, topological features are suitable to capture multi-scale, global, and intrinsic properties of data sets. This utility has been recognised with the rise of TDA, and topological information is now generally accepted to be relevant in the context of data analysis. Numerous works aim to leverage such information to gain a fundamentally different perspective on their data sets. We want to focus on a recent “outgrowth” of TDA, i.e., the integration of topological methods into machine learning models.
Our survey therefore discusses this ongoing synthesis of topology and machine learning, giving an overview of recent developments in the field. As an emerging research topic, topological machine learning is highly active and rapidly developing. Our survey is therefore explicitly not intended as a formal and complete review of the field. We rather want to identify, present, and discuss some of the main directions of developments, applications, and challenges in topological machine learning as we perceive it based on our own research background. Our aim is to provide newcomers to the field with a high-level overview of some of the central developments and techniques that have been developed, highlighting some “nuggets,” and outlining common threads and future challenges. We focus on publications in major machine learning conferences (such as AISTATS, ICLR, ICML, and NeurIPS) and journals (such as JMLR) but want to note that the selection of topics and papers presented here reflects our own preferences and knowledge. In particular, we decided against the inclusion of unpublished work in this area.
The survey is broadly structured as follows: we first provide a brief mathematical background on persistent homology, one of the core concepts of topological data analysis, in section 2. Following the introduction, the main part of the survey is in section 3. Section 3.2 focuses on what we term input features, i.e., representations of topological descriptors that serve as inputs to machine learning models, while the subsequent part discusses intrinsic features, i.e., the use of topology within a model itself. We close with an outlook on future challenges.
This section provides some background on basic concepts from algebraic topology and persistent homology. For in-depth treatments of the subject matter, we refer to standard literature such as Bredon.
A basic hypothesis in data analysis which drives current research is that data has shape: high-dimensional data sets are assumed to lie on or near a lower-dimensional manifold, a notion commonly referred to as the manifold hypothesis.
Topology studies invariant properties of (topological) spaces under homeomorphisms (i.e., continuous transformations); in the following, we restrict ourselves to topological manifolds, so as to simplify the exposition. A fundamental problem in topology is about classification: when are two spaces equivalent up to homeomorphism, and which computable invariants can distinguish them?
The building blocks of simplicial homology are simplices: a k-simplex is the convex hull of k + 1 affinely independent points, so that a 0-simplex is a vertex, a 1-simplex an edge, and a 2-simplex a (filled) triangle. Similarly, a general simplicial complex K is a collection of simplices that is closed under taking faces, i.e., every face of a simplex of K is itself contained in K. The k-simplices of K span a vector space C_k(K), the k-th chain group (we take coefficients in the field ℤ/2ℤ for simplicity), together with the boundary maps ∂_k : C_k(K) → C_{k−1}(K), which are defined by sending each k-simplex to the sum of its (k − 1)-dimensional faces on the basis elements and extended linearly. A crucial property of the boundary maps is that they compose to 0, that is ∂_k ∘ ∂_{k+1} = 0, which implies im ∂_{k+1} ⊆ ker ∂_k. The quotient vector space H_k(K) := ker ∂_k / im ∂_{k+1} is the k-th homology group of K. Its dimension, the k-th Betti number β_k, counts the k-dimensional features of K: connected components (k = 0), cycles (k = 1), voids (k = 2), and so on.
A simplicial complex modelling a triangle.
Using the simplicial complex in the figure above, these definitions can be made concrete by writing down the boundary matrices and computing the Betti numbers by hand.
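To make the computation above explicit, the Betti numbers of a small simplicial complex can be obtained from the ranks of its boundary matrices over ℤ/2ℤ. The following is a minimal sketch (function names are our own); it encodes the hollow triangle with vertices a, b, c and edges ab, bc, ca:

```python
import numpy as np

def rank_gf2(M):
    """Rank of a binary matrix over the field Z/2Z via Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        # Find a pivot row with a 1 in this column.
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        # Eliminate the column from all other rows (XOR = addition mod 2).
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

def betti_numbers(chain_dims, boundaries):
    """beta_k = dim C_k - rank(d_k) - rank(d_{k+1}),
    with the convention that d_0 and d_{max+1} are zero maps."""
    ranks = [0] + [rank_gf2(B) for B in boundaries] + [0]
    return [chain_dims[k] - ranks[k] - ranks[k + 1]
            for k in range(len(chain_dims))]

# Hollow triangle: 3 vertices (a, b, c), 3 edges (ab, bc, ca), no 2-simplex.
# Column j of d1 marks the two endpoint vertices of edge j.
d1 = np.array([[1, 0, 1],
               [1, 1, 0],
               [0, 1, 1]])
```

On this complex, `betti_numbers([3, 3], [d1])` yields β₀ = 1 (one connected component) and β₁ = 1 (one cycle); adding the filling 2-simplex, whose boundary column is (1, 1, 1)ᵀ, removes the cycle.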
Persistent homology (Edelsbrunner et al.) extends simplicial homology from a single complex to a filtration, i.e., a nested sequence of simplicial complexes ∅ = K_0 ⊆ K_1 ⊆ … ⊆ K_n = K. Such a sequence commonly arises from a simplicial complex K with filtration function f : K → ℝ, whose sublevel sets K_ϵ := f^{−1}((−∞, ϵ]) are subcomplexes of K and thus precisely consist of the simplices whose function value does not exceed ϵ. Persistent homology tracks how topological features appear (are “born”) and disappear (“die”) as ϵ increases. A common example is the Vietoris–Rips filtration of a point cloud, in which a simplex is included as soon as all pairwise distances between its vertices are at most ϵ.
Different stages of a Vietoris–Rips filtration for a simple “circle” point cloud. From left to right, connectivity of the underlying simplicial complex increases as ϵ increases.
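For intuition, the 0-dimensional persistence pairs of a Vietoris–Rips filtration can be computed with a union-find sweep over the edges in order of increasing length: every merge of two connected components produces one finite (birth, death) pair. This is only a sketch for the 0-dimensional case (higher dimensions require a full boundary-matrix reduction), and the function name is our own:

```python
import math
from itertools import combinations

def vr_persistence_0d(points):
    """0-dimensional persistence pairs (birth, death) of a Vietoris-Rips
    filtration over a point cloud. All components are born at epsilon = 0;
    a component dies at the length of the edge merging it into another."""
    n = len(points)
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                 # two components merge: one dies at eps
            parent[rj] = ri
            pairs.append((0.0, eps))
    pairs.append((0.0, math.inf))    # one component persists forever
    return pairs
```

For the four corners of a unit square, for instance, three components die at ϵ = 1 (the side length) and one persists indefinitely.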
This information can be captured in a so-called persistence diagram: each topological feature is represented as a point (b, d) in the plane, where b denotes its birth and d its death in the filtration. Since d ≥ b for all points, the diagram lies on or above the diagonal, and the persistence d − b of a point measures the range of scales over which the corresponding feature exists.
A persistence diagram containing 1-dimensional topological features (cycles).
A crucial fact that makes persistent homology valuable for applications in data analysis is its stability: small perturbations of the input data induce only small perturbations of the resulting persistence diagrams with respect to the bottleneck distance.
This section comprises the main part of the paper, where we gather and discuss pertinent methods and tools in topological machine learning. We broadly group the methods into the following categories. First, in section 3.2, we discuss methods that deal with input features, i.e., representations of topological descriptors, such as vectorisations and kernels, that make them amenable to machine learning models. Subsequently, we discuss methods based on intrinsic features, where topological information enters the design or the analysis of a model directly.
This overview figure shows examples of methods discussed in the survey and their range of influence. Green and red boxes distinguish methods based on input features from those based on intrinsic features.
The categorisation of the approaches discussed in the present survey.
The table lists the surveyed approaches by category: Adams et al.; Bubenik; Carrière et al.; Chen et al.; Gabrielsson and Carlsson; Hofer et al.; Hofer C. et al.; Khrulkov and Oseledets; Kim et al.; Kusano et al.; Moor et al.; Ramamurthy et al.; Reininghaus et al.; Rieck et al.; Umeda; Zhao and Wang; Zhao et al.; Zhou et al.
Our paper selection is a cross-section over major machine learning conferences and machine learning journals. We refrain from comparing methods on certain tasks—such as classification—because there is considerable heterogeneity in the experimental setup, precluding a fair and meaningful comparison.
This section gives an overview of methods that aim at suitably representing topological features in order to use them as input features for machine learning models. We will refer to this class of methods as input methods, since the topological features serve as inputs to a subsequent machine learning model.
Persistence diagrams (see section 2) constitute useful descriptors of homological information of data. However, being multisets of points, they cannot be used directly as inputs to most machine learning algorithms, which typically require fixed-size feature vectors or at least a Hilbert space structure.
Representations and kernel-based methods should ideally be efficiently computable, satisfy similar stability properties as the persistence diagrams themselves—hence exhibiting robustness properties with respect to noise—as well as provide some interpretable features. The stability of such representations is based on the fundamental stability theorem by Cohen-Steiner et al., which bounds the bottleneck distance between two persistence diagrams by the supremum-norm distance between the underlying filtration functions.
Arguably the simplest form of employing topological descriptors in machine learning tasks uses Betti curves: a persistence diagram D is mapped to the function β : ℝ → ℕ defined by β(ϵ) = Σ_{(b,d)∈D} 1_{[b,d)}(ϵ), where 1_{[b,d)} is the indicator function of the half-open interval [b, d), so that β(ϵ) counts the features alive at scale ϵ. The Betti curve was often informally used to analyse data (see, e.g., Umeda) before being treated as a vectorisation in its own right.
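Evaluated on a fixed grid of scale values, the Betti curve immediately yields a fixed-size feature vector. A minimal sketch, with name and signature our own:

```python
def betti_curve(diagram, grid):
    """Evaluate the Betti curve of a persistence diagram on a grid:
    beta(eps) counts the half-open intervals [b, d) containing eps."""
    return [sum(1 for b, d in diagram if b <= eps < d) for eps in grid]
```

For example, a diagram with intervals [0, 2) and [1, 3) has Betti curve values 1, 2, and 1 at ϵ = 0.5, 1.5, and 2.5, respectively.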
A persistence diagram and its associated Betti curve.
A more fundamental technique, developed by Carrière et al., maps a persistence diagram to the vector of pairwise distances between its points, sorted in decreasing order, yielding a stable signature of fixed size.
As a somewhat more complicated, but also more expressive, representation, Bubenik introduced persistence landscapes: a persistence diagram D is mapped to a sequence of piecewise-linear functions λ_k : ℝ → ℝ given by λ_k(t) = kmax_{(b,d)∈D} min(t − b, d − t)_+, where kmax denotes the k-th largest value of its argument and x_+ := max(x, 0). Persistence landscapes live in a Banach space, so that statistical quantities such as means are well-defined.
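A direct, if inefficient, evaluation of the landscape functions follows the definition; this sketch (names are our own) computes λ_k(t) by sorting the “tent” values of all diagram points:

```python
def landscape(diagram, k, t):
    """k-th persistence landscape function (k = 1 is the largest)
    evaluated at t: the k-th largest tent value max(min(t-b, d-t), 0)."""
    tents = sorted((max(min(t - b, d - t), 0.0) for b, d in diagram),
                   reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0
```

For the diagram {(0, 2), (1, 3)}, the first landscape peaks at λ₁(1) = 1, while at t = 1.5 the two tents intersect and λ₁(1.5) = λ₂(1.5) = 0.5.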
Computing a persistence image (Adams et al.) proceeds in several steps. The persistence diagram is first transformed into birth–persistence coordinates, i.e., each point (b, d) is mapped to (b, d − b). Next, for each transformed point, a probability distribution ϕ (typically a Gaussian) is centred at that point and scaled by a weighting function w, yielding a persistence surface; integrating this surface over the cells of a regular grid produces the persistence image, a fixed-size vector. In the process of generating persistence images, there are three non-canonical choices to be made: first, the choice of the weighting function, which is often chosen to emphasise features in the persistence diagram with large persistence value; next, the distributions ϕ placed at the points; and finally the resolution of the resulting image.
A persistence image arises as a discretisation of the density function (with appropriate weights) supported on a persistence diagram. It permits the calculation of an increasingly better-resolved sequence of images, which may be directly used as feature vectors.
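The construction can be sketched in a few lines. In the following simplified version, the names, defaults, and the use of unnormalised Gaussians are our own, and the weighting function is simply the persistence itself:

```python
import numpy as np

def persistence_image(diagram, resolution=20, sigma=0.1, extent=(0, 1, 0, 1)):
    """Persistence image sketch: map diagram points to birth-persistence
    coordinates, weight each by its persistence, and sum Gaussians
    evaluated on a regular grid."""
    x0, x1, y0, y1 = extent
    xs = np.linspace(x0, x1, resolution)           # birth axis
    ys = np.linspace(y0, y1, resolution)           # persistence axis
    gx, gy = np.meshgrid(xs, ys)
    img = np.zeros((resolution, resolution))
    for b, d in diagram:
        pers = d - b                               # birth-persistence transform
        img += pers * np.exp(-((gx - b) ** 2 + (gy - pers) ** 2)
                             / (2 * sigma ** 2))
    return img
```

A diagram with a single point (0.2, 0.8) produces an image whose peak sits near birth 0.2 and persistence 0.6; flattening `img` yields the feature vector.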
As an alternative to the previously-discussed representations, we now want to briefly focus on persistence diagrams again. The space of persistence diagrams can be endowed with metrics, such as the bottleneck distance. However, there is no natural Hilbert space structure on it, and such metrics tend to be computationally prohibitive or require the use of complex approximation algorithms (Kerber et al.). This motivates kernel-based methods, which implicitly embed persistence diagrams into a Hilbert space and thereby make them usable in kernel machines such as support vector machines.
While most of the aforementioned kernels are used to directly compare persistence diagrams, there are also examples of kernels defined on other topological signatures.
One of the seminal methods that built a bridge between modern machine learning techniques and TDA is a work by Hofer et al., who introduced a differentiable input layer for persistence diagrams: the points of a diagram are projected onto a set of learnable structure elements, so that a task-specific vectorisation can be learnt end-to-end.
This approach, as well as the development of the “DeepSets” architecture (Zaheer et al.), paved the way for learning general permutation-invariant functions of persistence diagrams. Such functions can be expressed in the form f(D) = ρ(Σ_{x∈D} ϕ(x)), where ϕ denotes a learnable transformation applied to each point of the diagram and ρ a learnable function applied to the aggregated sum.
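A toy numerical illustration of this sum decomposition makes the permutation invariance explicit. All parameter names below are hypothetical, and random linear maps with a ReLU stand in for the learnable networks ϕ and ρ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learnable networks: phi maps each diagram
# point (b, d) to R^8, rho maps the aggregated vector to a scalar.
W_phi = rng.normal(size=(2, 8))
W_rho = rng.normal(size=(8,))

def deepsets_score(diagram):
    """Permutation-invariant embedding f(D) = rho(sum_x phi(x))."""
    X = np.asarray(diagram, dtype=float)                   # (n, 2) birth-death pairs
    H = np.maximum(X @ W_phi, 0.0)                         # phi, applied point-wise
    return float(np.maximum(H.sum(axis=0), 0.0) @ W_rho)   # rho after the sum
```

Because the points are aggregated by a sum, reordering the diagram leaves the output unchanged, exactly as required for multisets.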
This section reviews methods that either incorporate topological information directly into the design of a machine learning model itself, or leverage topology to study aspects of such a model. We refer to such features as intrinsic features, since the topological information is used within the model rather than as a precomputed input.
As a recent example, Moor et al. proposed topological autoencoders, in which an additional loss term encourages the persistent homology of the latent space to match that of the input space, thereby preserving topological structure during dimensionality reduction.
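The flavour of such a loss can be sketched using the fact that the finite 0-dimensional persistence values of a Vietoris–Rips filtration are exactly the edge lengths of the minimum spanning tree. The published method matches persistence pairings between the two spaces; the following simplified, hypothetical variant (names are our own) merely compares sorted values:

```python
import numpy as np

def mst_edge_lengths(X):
    """Lengths of the minimum-spanning-tree edges of a point cloud
    (Prim's algorithm); these equal the finite 0-dimensional persistence
    values of its Vietoris-Rips filtration."""
    X = np.asarray(X, dtype=float)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = len(X)
    in_tree = [0]
    best = D[0].copy()          # cheapest connection of each point to the tree
    lengths = []
    for _ in range(n - 1):
        best[in_tree] = np.inf  # never re-add tree members
        j = int(np.argmin(best))
        lengths.append(best[j])
        in_tree.append(j)
        best = np.minimum(best, D[j])
    return np.sort(np.array(lengths))

def topological_loss(X, Z):
    """Simplified topology-preservation penalty: squared difference of the
    sorted 0-dimensional persistence values of input X and latent Z
    (both with the same number of points)."""
    return float(np.mean((mst_edge_lengths(X) - mst_edge_lengths(Z)) ** 2))
```

In an autoencoder, `topological_loss` would be added to the reconstruction loss, penalising latent configurations whose connectivity structure deviates from that of the input batch.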
An approach by Chen et al. regularises the decision boundary of a classifier by penalising its topological complexity, as measured by persistent homology.
Hofer et al. proposed a loss term based on persistent homology that controls the connectivity of learnt latent representations, which proved beneficial for one-class learning.
As a more involved example of methods that make use of intrinsic features, Zhao et al. integrate persistence-based features of local neighbourhoods directly into the message-passing scheme of a graph neural network.
Last, to provide a somewhat complementary perspective to preceding work, a paper by Hofer et al. imposes topological constraints on the distribution of internal representations of a classifier, which improves generalisation in small-sample-size regimes.
Shifting our view from regularisation techniques, topological analysis has been applied to evaluate generative adversarial networks (GANs). A GAN (Goodfellow et al.) consists of two networks, a generator and a discriminator, that are trained in an adversarial fashion; topological comparisons between generated and real data, such as the geometry score of Khrulkov and Oseledets, can serve as quality measures for the generator.
In a different direction, the topological analysis of the intrinsic structure of a classifier, such as a neural network, makes it possible to improve a variety of tasks. This includes the analysis of training behaviour as well as model selection, i.e., the comparison of different architectures with respect to their suitability for a given task.
While the literature dedicated to the better understanding of deep neural networks has typically focused on their functional properties, Rieck et al. take a structural perspective: their neural persistence measure computes the 0-dimensional persistent homology of a network's weighted layer graphs and can be used, for instance, as an early-stopping criterion that does not require validation data.
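A rough sketch of the idea for a single fully connected layer is given below. The function name, the normalisation, and the choice of the 2-norm reflect our reading of the construction, and details differ in the original:

```python
import numpy as np

def neural_persistence(W):
    """Sketch of neural persistence for one fully connected layer:
    filter the bipartite graph of the layer by decreasing normalised
    absolute weight and accumulate the 0-dimensional persistence of the
    merging components (union-find over edges, strongest edges first)."""
    A = np.abs(W) / np.abs(W).max()       # normalise weights to [0, 1]
    n_in, n_out = A.shape
    edges = sorted(((A[i, j], i, n_in + j)
                    for i in range(n_in) for j in range(n_out)),
                   reverse=True)
    parent = list(range(n_in + n_out))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0.0
    for w, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru
            total += (1.0 - w) ** 2       # persistence of the dying component
    return np.sqrt(total)                 # aggregate via a 2-norm
```

Sparse layers dominated by a few strong weights yield large values, while a perfectly uniform layer yields zero, matching the intuition that the measure captures structural complexity.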
Ramamurthy et al. introduce labelled simplicial complexes to estimate the topological complexity of a classifier's decision boundary, which can inform principled model selection.
Gabrielsson and Carlsson apply persistent homology to the learnt weights of convolutional neural networks, observing that well-generalising networks exhibit highly structured spaces of filters.
This survey provided a glimpse of the nascent field of topological machine learning, i.e., the symbiosis of methods from topological data analysis with machine learning algorithms.
Numerous avenues for future research exist. Of the utmost importance is the improvement of the “software ecosystem.” Software libraries such as GUDHI and Ripser already make persistent homology calculations broadly accessible, but a tighter integration of topological methods with modern machine learning frameworks, including support for automatic differentiation, is still needed.
On the side of applications, we note that several papers already target problems such as graph classification, but they are primarily based on fixed filtrations (with the notable exception of Hofer et al., who learn a filtration end-to-end); making filtrations themselves learnable remains a promising direction for future work.
As another upcoming topic, we think that the analysis of time-varying data sets using topology-based methods is long overdue. With initial work by Cohen-Steiner et al. on vineyards, i.e., time-varying persistence diagrams, theoretical foundations exist, but scalable methods and representations for dynamic data sets have yet to be developed.
FH, MM, and BR performed the literature search and revised the draft. FH and BR drafted the original manuscript. All authors contributed to the article and approved the submitted version.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.