<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/frai.2021.681108</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A Survey of Topological Machine Learning Methods</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Hensel</surname> <given-names>Felix</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1272032/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Moor</surname> <given-names>Michael</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1072566/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Rieck</surname> <given-names>Bastian</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1062708/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Machine Learning and Computational Biology Laboratory, ETH Zurich</institution>, <addr-line>Zurich</addr-line>, <country>Switzerland</country></aff>
<aff id="aff2"><sup>2</sup><institution>Swiss Institute of Bioinformatics</institution>, <addr-line>Lausanne</addr-line>, <country>Switzerland</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Kathryn Hess, &#x000C9;cole Polytechnique F&#x000E9;d&#x000E9;rale de Lausanne, Switzerland</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Raphael Reinauer, &#x000C9;cole Polytechnique F&#x000E9;d&#x000E9;rale de Lausanne, Switzerland; Matteo Caorsi, L2F SA, Switzerland</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Bastian Rieck <email>bastian.rieck&#x00040;bsse.ethz.ch</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence</p></fn></author-notes>
<pub-date pub-type="epub">
<day>26</day>
<month>05</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>4</volume>
<elocation-id>681108</elocation-id>
<history>
<date date-type="received">
<day>15</day>
<month>03</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>04</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Hensel, Moor and Rieck.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Hensel, Moor and Rieck</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>The last decade saw an enormous boost in the field of computational topology: methods and concepts from algebraic and differential topology, formerly confined to the realm of pure mathematics, have demonstrated their utility in numerous areas such as computational biology, personalised medicine, and time-dependent data analysis, to name a few. The newly emerging domain comprising topology-based techniques is often referred to as topological data analysis (TDA). Next to their applications in the aforementioned areas, TDA methods have also proven to be effective in supporting, enhancing, and augmenting both classical machine learning and deep learning models. In this paper, we review the state of the art of a nascent field we refer to as &#x0201C;topological machine learning,&#x0201D; i.e., the successful symbiosis of topology-based methods and machine learning algorithms, such as deep neural networks. We identify common threads, current applications, and future challenges.</p></abstract>
<kwd-group>
<kwd>computational topology</kwd>
<kwd>persistent homology</kwd>
<kwd>machine learning</kwd>
<kwd>topology</kwd>
<kwd>survey</kwd>
<kwd>topological machine learning</kwd>
</kwd-group>
<contract-sponsor id="cn001">Schweizerischer Nationalfonds zur F&#x000F6;rderung der Wissenschaftlichen Forschung<named-content content-type="fundref-id">10.13039/501100001711</named-content></contract-sponsor>
<counts>
<fig-count count="7"/>
<table-count count="1"/>
<equation-count count="14"/>
<ref-count count="65"/>
<page-count count="12"/>
<word-count count="9485"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Topological machine learning recently started to emerge as a field at the interface of topological data analysis (TDA) and machine learning. It is driven by improvements of computational methods, which make the calculation of topological features (via persistent homology, for instance) increasingly flexible and scalable to more complex and larger data sets.</p>
<p>Topology is colloquially often referred to as encoding the overall shape of data. Hence, as a complement to localised and generally more rigid geometric features, topological features are suitable to capture multi-scale, global, and intrinsic properties of data sets. This utility has been recognised with the rise of TDA, and topological information is now generally accepted to be relevant in the context of data analysis. Numerous works aim to leverage such information to gain a fundamentally different perspective on their data sets. We want to focus on a recent &#x0201C;outgrowth&#x0201D; of TDA, i.e., the integration of topological methods to <italic>enhance</italic> or <italic>augment</italic> both classical machine learning methods and deep learning models.</p>
<p>Our survey therefore discusses this ongoing synthesis of topology and machine learning, giving an overview of recent developments in the field. As an emerging research topic, topological machine learning is highly active and rapidly developing. Our survey is therefore explicitly not intended as a formal and complete review of the field. We rather want to identify, present, and discuss some of the main directions of developments, applications, and challenges in topological machine learning as we perceive it based on our own research background. Our aim is to provide newcomers to the field with a high-level overview of some of the central developments and techniques that have been developed, highlighting some &#x0201C;nuggets,&#x0201D; and outlining common threads and future challenges. We focus on publications in major machine learning conferences (such as AISTATS, ICLR, ICML, and NeurIPS) and journals (such as JMLR) but want to note that the selection of topics and papers presented here reflects our own preferences and knowledge. In particular, we decided against the inclusion of unpublished work in this area.</p>
<p>The survey is broadly structured as follows: we first provide a brief mathematical background on persistent homology, one of the core concepts of topological data analysis, in section 2. Following this introduction, the main part of the survey is section 3. Section 3.2 focuses on what we term <italic>extrinsic topological features</italic> in machine learning. These methods are mainly concerned with the transformation of topological descriptors of data into feature vectors of fixed dimensionality, permitting their use as features in machine learning frameworks. This is in contrast to <italic>intrinsic topological features</italic>, presented in section 3.3, which employ topological features to analyse or influence the machine learning model itself, for instance through architectural choices or regularisation. Finally, section 4 discusses future directions and challenges in topological machine learning.</p>
</sec>
<sec id="s2">
<title>2. Background on Algebraic Topology and Persistent Homology</title>
<p>This section provides some background on basic concepts from algebraic topology and persistent homology. For in-depth treatments of the subject matter, we refer to standard literature (Bredon, <xref ref-type="bibr" rid="B6">1993</xref>; Hatcher, <xref ref-type="bibr" rid="B28">2000</xref>; Edelsbrunner and Harer, <xref ref-type="bibr" rid="B22">2010</xref>). Readers familiar with algebraic topology and the concept of persistent homology may safely skip this section.</p>
<p>A basic hypothesis in data analysis which drives current research is that data has <italic>shape</italic>, or put differently, that data is sampled from an underlying manifold&#x02014;the so-called &#x0201C;manifold hypothesis&#x0201D; (Fefferman et al., <xref ref-type="bibr" rid="B24">2013</xref>). Instead of restricting the analysis to statistical descriptors, <italic>topological data analysis</italic> (TDA) aims to analyse data from a fundamentally different perspective by investigating this underlying manifold structure in an algebraic fashion. Namely, one computes descriptors of data sets which are <italic>stable</italic> under perturbation and encode intrinsic <italic>multi-scale</italic> information on their shape. TDA is a rapidly developing field of mathematics aiming to leverage concepts from the well-established field of (algebraic) topology in applications to real-world data sets and machine learning.</p>
<p>Topology studies invariant properties of (topological) spaces under homeomorphisms (i.e., continuous transformations); in the following, we restrict ourselves to topological manifolds, so as to simplify the exposition. A fundamental problem in topology is classification: <italic>How can two manifolds be distinguished from each other?</italic> Algebraic topology (Bredon, <xref ref-type="bibr" rid="B6">1993</xref>; Hatcher, <xref ref-type="bibr" rid="B28">2000</xref>) provides sophisticated and powerful tools to study this question. The basic idea is to associate computable <italic>algebraic structures</italic> (e.g., groups or vector spaces) to a manifold that remain <italic>invariant</italic> under homeomorphisms. A very important class of algebraic invariants are the <italic>homology groups</italic>, which encode a great deal of information while still being efficiently computable in many cases. Homology groups arise from combinatorial representations of the manifold, the <italic>chain complexes</italic>.</p>
<sec>
<title>2.1. Chain Complexes and Homology</title>
<p>The <italic>standard</italic> <italic>k</italic><italic>-simplex</italic> &#x00394;<sup><italic>k</italic></sup> is defined as the convex hull of the standard basis vectors in &#x0211D;<sup><italic>k</italic>&#x0002B;1</sup>, i.e.,</p>
<disp-formula id="E1"><mml:math id="M1"><mml:msup><mml:mrow><mml:mo>&#x00394;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="true">|</mml:mo></mml:mrow><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02265;</mml:mo><mml:mn>0</mml:mn><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x02200;</mml:mo><mml:mi>i</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:math></disp-formula>
<p>Similarly, a general <italic>k</italic><italic>-simplex</italic> [<italic>v</italic><sub>0</sub>, &#x02026;, <italic>v</italic><sub><italic>k</italic></sub>] is the convex hull of <italic>k</italic> &#x0002B; 1 affinely independent points <italic>v</italic><sub>0</sub>, &#x02026;, <italic>v</italic><sub><italic>k</italic></sub> in a Euclidean space. Note that deleting one of the <italic>vertices</italic> <italic>v</italic><sub><italic>i</italic></sub> from a <italic>k</italic>-simplex [<italic>v</italic><sub>0</sub>, &#x02026;, <italic>v</italic><sub><italic>k</italic></sub>] yields a (<italic>k</italic> &#x02212; 1)-simplex <inline-formula><mml:math id="M2"><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> which is determined by the remaining vertices and called the <italic>i</italic><italic>-th face</italic> of [<italic>v</italic><sub>0</sub>, &#x02026;, <italic>v</italic><sub><italic>k</italic></sub>]. Simplices are the basic building blocks of chain complexes that are used in algebraic topology for the computation of homological invariants. Any <italic>topological manifold</italic> <italic>X</italic> can be topologically modelled using simplices (see <xref ref-type="fig" rid="F1">Figure 1</xref>). A <italic>singular</italic> <italic>k</italic><italic>-simplex</italic> in <italic>X</italic> is a continuous map &#x003C3;:&#x00394;<sup><italic>k</italic></sup> &#x02192; <italic>X</italic>. 
It is not required that &#x003C3; be an embedding; for instance, any constant map, sending &#x00394;<sup><italic>k</italic></sup> to a single point in <italic>X</italic>, is a valid singular simplex. The inclusion of the <italic>i</italic>-th face of &#x00394;<sup><italic>k</italic></sup> is an important singular simplex in &#x00394;<sup><italic>k</italic></sup>, which we will denote by <inline-formula><mml:math id="M3"><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>:</mml:mo><mml:msup><mml:mrow><mml:mo>&#x00394;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x02192;</mml:mo><mml:msup><mml:mrow><mml:mo>&#x00394;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>. To keep the exposition simple, we will restrict ourselves to working over the two-element field &#x1D53D;<sub>2</sub>: &#x0003D; &#x02124;/2&#x02124; in what follows. Given any space <italic>X</italic>, its <italic>singular</italic> <italic>k</italic><italic>-chains</italic> are elements of the &#x1D53D;<sub>2</sub>-vector space <italic>C</italic><sub><italic>k</italic></sub>(<italic>X</italic>) generated by the set of all singular <italic>k</italic>-simplices in <italic>X</italic>. Elements in <italic>C</italic><sub><italic>k</italic></sub>(<italic>X</italic>) are thus &#x0201C;formal sums&#x0201D; of simplices. The <italic>singular chain complex</italic> (<italic>C</italic>(<italic>X</italic>), &#x02202;) of <italic>X</italic> is the sequence of spaces</p>
<disp-formula id="E2"><mml:math id="M4"><mml:mo>&#x02026;</mml:mo><mml:mover class="overset"><mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mover><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mover class="overset"><mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mover><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mover class="overset"><mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mover></mml:math></disp-formula>
<disp-formula id="E3"><mml:math id="M5"><mml:mo>&#x02026;</mml:mo><mml:mover class="overset"><mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mover><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mover class="overset"><mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mover><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mover class="overset"><mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mover><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:math></disp-formula>
<p>together with the <italic>boundary maps</italic> &#x02202;<sub><italic>k</italic></sub> : <italic>C</italic><sub><italic>k</italic></sub>(<italic>X</italic>) &#x02192; <italic>C</italic><sub><italic>k</italic>&#x02212;1</sub>(<italic>X</italic>) given by</p>
<disp-formula id="E4"><mml:math id="M6"><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>&#x003C3;</mml:mi><mml:mo>&#x025E6;</mml:mo><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:math></disp-formula>
<p>on the basis elements and extended linearly. A crucial property of the boundary maps is that consecutive ones compose to 0, that is, &#x02202;<sub><italic>k</italic>&#x02212;1</sub> &#x025E6; &#x02202;<sub><italic>k</italic></sub> &#x0003D; 0. Elements of <italic>Z</italic><sub><italic>k</italic></sub>(<italic>X</italic>): &#x0003D; ker(&#x02202;<sub><italic>k</italic></sub>) are called <italic>k</italic><italic>-cycles</italic> and those of <italic>B</italic><sub><italic>k</italic></sub>(<italic>X</italic>): &#x0003D; im(&#x02202;<sub><italic>k</italic>&#x0002B;1</sub>) are called <italic>k</italic><italic>-boundaries</italic>; since every boundary is a cycle, the well-defined quotient</p>
<disp-formula id="E5"><mml:math id="M7"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:msub><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></disp-formula>
<p>is the <italic>k</italic><italic>-th singular homology group</italic> of <italic>X</italic> (despite the name, this is still technically a quotient vector space; however, the group-theoretical viewpoint is more convenient and prevalent in algebraic topology). The homology groups are <italic>topological invariants</italic>, i.e., they remain invariant under homeomorphisms and therefore encode intrinsic information on the topology of <italic>X</italic>. Thus, homology groups and simpler invariants derived from them, such as the <italic>Betti numbers</italic> &#x003B2;<sub><italic>k</italic></sub>: &#x0003D; dim <italic>H</italic><sub><italic>k</italic></sub>(<italic>X</italic>), are useful in studying the classification question raised above. For example, the 0-th Betti number &#x003B2;<sub>0</sub> counts the connected components of a space, while &#x003B2;<sub>1</sub> counts its independent cycles.</p>
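<p>To make the boundary operator tangible, the following minimal Python sketch (our own illustration, not tooling from the literature discussed here) represents a simplex as a tuple of vertices and a chain as a set of simplices; over &#x1D53D;<sub>2</sub>, mod-2 addition of chains is simply the symmetric difference of sets.</p>

```python
def boundary(chain):
    """Boundary of a formal sum of simplices over the field F2.

    A chain is a set of simplices, each written as a tuple of vertices.
    Over F2, a face appearing an even number of times cancels out, so
    adding a face mod 2 amounts to a symmetric difference of sets.
    """
    result = set()
    for simplex in chain:
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # delete the i-th vertex
            result ^= {face}                      # add the face mod 2
    return result


# The filled triangle, modelled as a single 2-simplex.
triangle = {("a", "b", "c")}
edges = boundary(triangle)  # the three boundary edges of the triangle
print(edges == {("b", "c"), ("a", "c"), ("a", "b")})  # True
print(boundary(edges))      # set(): consecutive boundary maps compose to 0
```

<p>The empty boundary of <monospace>edges</monospace> is a direct check of &#x02202;<sub>1</sub> &#x025E6; &#x02202;<sub>2</sub> &#x0003D; 0 on this example.</p>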
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>A simplicial complex modelling a triangle.</p></caption>
<graphic xlink:href="frai-04-681108-g0001.tif"/>
</fig>
<sec>
<title>2.1.1. Brief Example</title>
<p>Using the simplicial complex in <xref ref-type="fig" rid="F1">Figure 1</xref>, we briefly illustrate some of the aforementioned concepts. Let <italic>X</italic> &#x0003D; {{<italic>a</italic>}, {<italic>b</italic>}, {<italic>c</italic>}, {<italic>a, b</italic>}, {<italic>b, c</italic>}, {<italic>a, c</italic>}} be the hollow triangle, i.e., the simplicial complex consisting of three vertices and three edges. Had we also included the 2-simplex {<italic>a, b, c</italic>}, its <italic>boundary</italic> would be the non-trivial chain &#x02202;<sub>2</sub>{<italic>a, b, c</italic>} &#x0003D; {<italic>b, c</italic>} &#x0002B; {<italic>a, c</italic>} &#x0002B; {<italic>a, b</italic>}. The boundary of this chain of edges is trivial, though, because duplicate simplices cancel each other out over &#x1D53D;<sub>2</sub>. We get &#x02202;<sub>1</sub>({<italic>b, c</italic>} &#x0002B; {<italic>a, c</italic>} &#x0002B; {<italic>a, b</italic>}) &#x0003D; {<italic>c</italic>} &#x0002B; {<italic>b</italic>} &#x0002B; {<italic>c</italic>} &#x0002B; {<italic>a</italic>} &#x0002B; {<italic>b</italic>} &#x0002B; {<italic>a</italic>} &#x0003D; 0, which is consistent with the fact that consecutive boundary maps compose to 0. To compute <italic>H</italic><sub>1</sub>(<italic>X</italic>): &#x0003D; <italic>Z</italic><sub>1</sub>(<italic>X</italic>)/<italic>B</italic><sub>1</sub>(<italic>X</italic>), we only have to calculate <italic>Z</italic><sub>1</sub>(<italic>X</italic>); the boundary group <italic>B</italic><sub>1</sub>(<italic>X</italic>) is trivial because <italic>X</italic> does not contain any 2-simplices. By definition, <italic>Z</italic><sub>1</sub>(<italic>X</italic>) &#x0003D; ker(&#x02202;<sub>1</sub>) &#x0003D; span ({<italic>a, b</italic>} &#x0002B; {<italic>b, c</italic>} &#x0002B; {<italic>a, c</italic>}). This is the <italic>only</italic> cycle in <italic>X</italic>, which we can easily verify either by inspection or by checking all combinations of edges. Hence <italic>H</italic><sub>1</sub>(<italic>X</italic>) &#x0003D; <italic>Z</italic><sub>1</sub>(<italic>X</italic>) &#x02245; &#x1D53D;<sub>2</sub> and &#x003B2;<sub>1</sub> &#x0003D; 1; the triangle therefore exhibits a single cycle, which aligns with our intuition.</p>
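<p>This computation can be checked mechanically. The sketch below (our own illustration) encodes &#x02202;<sub>1</sub> of the hollow triangle (vertices and edges only) as a matrix over &#x1D53D;<sub>2</sub> and obtains the Betti numbers from the rank&#x02013;nullity relation &#x003B2;<sub><italic>k</italic></sub> &#x0003D; dim <italic>C</italic><sub><italic>k</italic></sub> &#x02212; rank &#x02202;<sub><italic>k</italic></sub> &#x02212; rank &#x02202;<sub><italic>k</italic>&#x0002B;1</sub>.</p>

```python
import numpy as np


def rank_f2(mat):
    """Rank of a 0/1 matrix over F2, via Gaussian elimination mod 2."""
    m = np.array(mat, dtype=np.uint8) % 2
    rank = 0
    for col in range(m.shape[1]):
        pivots = [r for r in range(rank, m.shape[0]) if m[r, col] == 1]
        if not pivots:
            continue
        m[[rank, pivots[0]]] = m[[pivots[0], rank]]  # move pivot row up
        for r in range(m.shape[0]):
            if r != rank and m[r, col] == 1:
                m[r] = (m[r] + m[rank]) % 2          # eliminate mod 2
        rank += 1
    return rank


# d1 for the hollow triangle: rows = vertices a, b, c;
# columns = edges {a,b}, {b,c}, {a,c}.
d1 = [[1, 0, 1],
      [1, 1, 0],
      [0, 1, 1]]

# No 2-simplices, so rank d2 = 0; d0 is the zero map.
betti_0 = 3 - rank_f2(d1)        # dim C_0 minus rank d1
betti_1 = (3 - rank_f2(d1)) - 0  # dim ker d1 minus rank d2
print(betti_0, betti_1)          # 1 1
```

<p>The result &#x003B2;<sub>0</sub> &#x0003D; 1, &#x003B2;<sub>1</sub> &#x0003D; 1 confirms that the hollow triangle is connected and carries a single cycle.</p>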
</sec>
</sec>
<sec>
<title>2.2. Persistent Homology</title>
<p>Persistent homology (Edelsbrunner et al., <xref ref-type="bibr" rid="B23">2000</xref>; Zomorodian and Carlsson, <xref ref-type="bibr" rid="B65">2005</xref>) is the flagship tool of TDA. In the analysis of real-world data, it is typically not a priori clear at what <italic>scale</italic> interesting topological features occur. By means of a filtration, indexed by a scale parameter, persistent homology captures topological changes across the whole range of scales and stores this information in so-called persistence diagrams.</p>
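<p>As a minimal, self-contained illustration (our own sketch; real analyses would rely on a TDA library such as GUDHI or Ripser), the following Python function computes the dimension-0 persistence pairs of a Vietoris&#x02013;Rips filtration of a point cloud. In dimension 0, persistence reduces to single-linkage clustering: every connected component is born at scale 0, and a component dies at the scale at which it merges into an older one; since all births are tied at 0, the sketch breaks ties by point index.</p>

```python
import numpy as np


def h0_persistence(points):
    """Dimension-0 persistence pairs of a Vietoris-Rips filtration.

    Every point is born at scale 0; a component dies when it merges into
    another one (elder rule, ties broken by index). One component never
    dies, which is recorded as an infinite death time.
    """
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Process edges in order of increasing length, as in the filtration.
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)  # keep the older root alive
            pairs.append((0.0, float(eps)))    # a component dies at eps
    pairs.append((0.0, float("inf")))          # the surviving component
    return pairs


pts = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
print(h0_persistence(pts))  # [(0.0, 1.0), (0.0, 4.0), (0.0, inf)]
```

<p>The resulting (birth, death) pairs are exactly the points of the dimension-0 persistence diagram; higher-dimensional features require the full boundary-matrix reduction and are beyond this sketch.</p>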
<p><italic>Persistent homology</italic> is an extension of homology to the setting of filtered chain complexes. A <italic>filtered chain complex</italic> is a (not-necessarily strictly) ascending sequence of chain complexes <inline-formula><mml:math id="M8"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x02282;</mml:mo><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x02282;</mml:mo><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x02282;</mml:mo><mml:mo>&#x02026;</mml:mo></mml:math></inline-formula> with inclusion maps <italic>&#x003B9;</italic><sup><italic>i</italic></sup> : <italic>C</italic><sup><italic>&#x003B5;</italic><sub><italic>i</italic></sub></sup><inline-graphic xlink:href="frai-04-681108-i0001.tif"/><italic>C</italic><sup><italic>&#x003B5;</italic><sub><italic>i</italic>&#x0002B;1</sub></sup> and <inline-formula><mml:math id="M10"><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msup><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msup><mml:mo>&#x025E6;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x025E6;</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mo>&#x025E6;</mml:mo></mml:math></inline-formula> <italic>&#x003B9;</italic><sup><italic>i</italic></sup> : 
<italic>C</italic><sup><italic>&#x003B5;</italic><sub><italic>i</italic></sub></sup><inline-graphic xlink:href="frai-04-681108-i0001.tif"/><italic>C</italic><sup><italic>&#x003B5;</italic><sub><italic>j</italic></sub></sup> for <italic>i</italic> &#x0003C; <italic>j</italic>. Filtered chain complexes naturally arise in situations where we have a sequence of inclusions of spaces <inline-formula><mml:math id="M11"><mml:msup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x02282;</mml:mo><mml:msup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x02282;</mml:mo><mml:msup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mo>&#x02282;</mml:mo><mml:mo>&#x02026;</mml:mo></mml:math></inline-formula>. Such cases, for instance, occur if we consider the sublevel sets <inline-formula><mml:math id="M12"><mml:msup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow></mml:msup><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0003C;</mml:mo><mml:mi>&#x003B5;</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> of a so-called <italic>filtration</italic> function <italic>f</italic> : <italic>X</italic> &#x02192; &#x0211D;, or if we consider a point cloud <italic>Y</italic> in a metric space (<italic>M</italic>, d) and set</p>
<disp-formula id="E6"><mml:math id="M13"><mml:msup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow></mml:msup><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x022C3;</mml:mo></mml:mrow><mml:mrow><mml:mi>y</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>Y</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0003C;</mml:mo><mml:mi>&#x003B5;</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></disp-formula>
<p>with filtration function <italic>g</italic> : <italic>M</italic> &#x02192; &#x0211D; given by <inline-formula><mml:math id="M14"><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mo class="qopname">inf</mml:mo></mml:mrow><mml:mrow><mml:mi>y</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>Y</mml:mi></mml:mrow></mml:msub><mml:mtext>d</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. Here <italic>B</italic><sub>&#x003B5;</sub>(<italic>y</italic>) denotes the open ball of radius &#x003B5; centred at <italic>y</italic> and we implicitly identify &#x003B5; &#x02243; &#x003B5;&#x02032; if <italic>X</italic><sup>&#x003B5;</sup> (resp. <italic>Y</italic><sup>&#x003B5;</sup>) is canonically homeomorphic to <italic>X</italic><sup>&#x003B4;</sup> (resp. <italic>Y</italic><sup>&#x003B4;</sup>) for all &#x003B4; &#x02208; [&#x003B5;, &#x003B5;&#x02032;]. 
An important property of (singular) homology is that it is <italic>functorial</italic> (see e.g., Bredon, <xref ref-type="bibr" rid="B6">1993</xref>), which implies that the inclusion maps &#x003B9;<sup><italic>i, j</italic></sup> induce maps on the respective homology groups <inline-formula><mml:math id="M15"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02192;</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. <xref ref-type="fig" rid="F2">Figure 2</xref> depicts the Vietoris&#x02013;Rips complex construction based on a distance filtration, a standard construction in TDA. The <italic>k</italic><italic>-th persistent homology groups</italic> are the images of these inclusions, that is</p>
<disp-formula id="E7"><mml:math id="M16"><mml:msubsup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>im&#x000A0;</mml:mtext><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02229;</mml:mo><mml:msub><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>
<p>and thus precisely consist of the <italic>k</italic>-th homology classes of <inline-formula><mml:math id="M17"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> that still exist after taking the inclusion <inline-formula><mml:math id="M18"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. A homology class <inline-formula><mml:math id="M19"><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> is said to be <italic>born</italic> at <inline-formula><mml:math id="M20"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> if <inline-formula><mml:math id="M21"><mml:mi>&#x003B1;</mml:mi><mml:mo>&#x02209;</mml:mo><mml:msubsup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, i.e., if it is not in the image of 
<inline-formula><mml:math id="M22"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. If &#x003B1; is born at <inline-formula><mml:math id="M23"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula>, it is said to <italic>die</italic> at <inline-formula><mml:math id="M24"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> if <inline-formula><mml:math id="M25"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02209;</mml:mo><mml:msubsup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math 
id="M26"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003B9;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msubsup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. The <italic>persistence</italic> of &#x003B1; is given by &#x003B5;<sub><italic>j</italic></sub> &#x02212; &#x003B5;<sub><italic>i</italic></sub> and is set to infinity if the class never dies. The <italic>persistent Betti numbers</italic>, defined by <inline-formula><mml:math id="M27"><mml:msubsup><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mo class="qopname">dim</mml:mo><mml:msubsup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, carry information on how the homology (and thus the topology) changes across the filtration.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Different stages of a Vietoris&#x02013;Rips filtration for a simple &#x0201C;circle&#x0201D; point cloud. From left to right, connectivity of the underlying simplicial complex increases as &#x003B5; increases.</p></caption>
<graphic xlink:href="frai-04-681108-g0002.tif"/>
</fig>
<p>This information can be captured in a so-called <italic>persistence diagram</italic>, a multiset in <inline-formula><mml:math id="M28"><mml:msup><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x0222A;</mml:mo><mml:mi>&#x0211D;</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mi>&#x0221E;</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>. Specifically, the persistence diagram of (homological) <italic>dimension</italic> <italic>k</italic> is given by the points <inline-formula><mml:math id="M29"><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> with multiplicity</p>
<disp-formula id="E8"><mml:math id="M30"><mml:msubsup><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></disp-formula>
<p>for all <italic>i</italic> &#x0003C; <italic>j</italic>. The multiplicity <inline-formula><mml:math id="M31"><mml:msubsup><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> counts the number of <italic>k</italic>-th homology classes that are born at <inline-formula><mml:math id="M32"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> and die at <inline-formula><mml:math id="M33"><mml:msup><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula>. <xref ref-type="fig" rid="F3">Figure 3</xref> depicts a simple persistence diagram, calculated from the Vietoris&#x02013;Rips complex in <xref ref-type="fig" rid="F2">Figure 2</xref>. The axes of this diagram correspond to the &#x003B5; values at which topological features are created and destroyed, respectively. The single point of high persistence corresponds to the primary topological feature of the point cloud, namely its circular shape. Other topological features occur at smaller scales&#x02014;lower values of &#x003B5;&#x02014;and hence form a small dense cluster in the lower-left corner of the persistence diagram. The persistent Betti numbers can be recovered from the persistence diagram itself (see Edelsbrunner and Harer, <xref ref-type="bibr" rid="B22">2010</xref>).</p>
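<p>To make this construction concrete, the following minimal Python sketch (our illustration, not part of the original exposition) computes the 0-dimensional persistence pairs of a Vietoris&#x02013;Rips filtration of a point cloud. Every connected component is born at &#x003B5; = 0, and a component dies at the weight of the minimum-spanning-tree edge that merges it into an older one, so Kruskal's algorithm with a union&#x2013;find structure suffices; higher-dimensional persistent homology is best left to dedicated libraries such as GUDHI or Ripser.</p>

```python
from itertools import combinations
from math import dist, inf

def h0_persistence_pairs(points):
    """0-dimensional persistence pairs of a Vietoris-Rips filtration.

    Components are born at epsilon = 0 and die when a minimum-spanning-tree
    edge merges them into an older component (Kruskal's algorithm).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # this edge merges two components: one dies
            parent[ri] = rj
            pairs.append((0.0, eps))
    pairs.append((0.0, inf))  # one component never dies
    return pairs
```

<p>For a point cloud with two well-separated clusters, e.g., [(0, 0), (0.1, 0), (5, 0), (5.1, 0)], the sketch yields two short-lived pairs and one high-persistence pair, illustrating how persistence separates salient structure from small-scale features.</p>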
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>A persistence diagram containing 1-dimensional topological features (cycles).</p></caption>
<graphic xlink:href="frai-04-681108-g0003.tif"/>
</fig>
<p>A crucial fact that makes persistent homology valuable for applications in data analysis is its <italic>stability with respect to perturbations</italic> of the filtration function. This means that persistent homology is robust to noise and constitutes an encoding of intrinsic topological properties of the data. More precisely, the space of persistence diagrams can be endowed with a metric such as the <italic>bottleneck distance</italic> or the <italic>Wasserstein distances</italic> (Edelsbrunner and Harer, <xref ref-type="bibr" rid="B22">2010</xref>). A celebrated stability theorem (Cohen-Steiner et al., <xref ref-type="bibr" rid="B17">2007</xref>) states that the <italic>L</italic><sub>&#x0221E;</sub>-distance of two real-valued functions <italic>f</italic> and <italic>g</italic> is an upper bound for the bottleneck distance <italic>W</italic><sub>&#x0221E;</sub> of their respective persistence diagrams <inline-formula><mml:math id="M34"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M35"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, i.e., <inline-formula><mml:math id="M36"><mml:msub><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x0221E;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02264;</mml:mo><mml:mo>|</mml:mo><mml:mo>|</mml:mo><mml:mi>f</mml:mi><mml:mo>-</mml:mo><mml:mi>g</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x0221E;</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:math></inline-formula> The stability theorem and its variants (Skraba and Turner, <xref ref-type="bibr" rid="B55">2020</xref>) are highly relevant for applications because they imply that the behaviour of persistent homology under noise is controlled; descriptors such as persistence diagrams change continuously as the input function is varied, and the &#x0201C;amplitude&#x0201D; of their change is bounded from above via the stability theorem.</p>
</sec>
</sec>
<sec id="s3">
<title>3. Survey</title>
<p>This section comprises the main part of the paper, where we gather and discuss pertinent methods and tools in topological machine learning. We broadly group the methods into the following categories. First, in section 3.2, we discuss methods that deal with <italic>extrinsic topological features</italic>. By the qualification <italic>extrinsic</italic>, we mean that no analysis of the topology of the machine learning model or the neural network itself is incorporated. These methods are instead mainly concerned with enabling the use of topological features, extracted from a given data set, in downstream machine learning models. This can be achieved through <italic>vectorisation</italic> of topological features or by designing specialised layers of neural networks that are capable of handling such features. Next, section 3.3 discusses <italic>intrinsic topological features</italic>. These are methods that incorporate the topological analysis of aspects of the machine learning model itself. Whenever applicable, we further classify methods into <italic>observational</italic> and <italic>interventional</italic> methods. This sub-classification specifies <italic>how</italic> the methods are applied in a machine learning framework. Observational methods &#x0201C;observe&#x0201D; the topology of the data or model but do not <italic>directly</italic> influence the model training or architecture. Interventional methods, by contrast, use topological properties of the data, as well as <italic>post-hoc</italic> analysis of topological features of machine learning models, to inform the architectural design and/or model training. See <xref ref-type="fig" rid="F4">Figure 4</xref> for an overview of the methods and their categories, as well as <xref ref-type="table" rid="T1">Table 1</xref> for the classification of all papers mentioned in this survey.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>This overview figure shows examples of methods discussed in the survey and their range of influence. Green (red) boxes signify <italic>observational</italic> (<italic>interventional</italic>) methods. <xref ref-type="table" rid="T1">Table 1</xref> provides a more in-depth classification of all methods.</p></caption>
<graphic xlink:href="frai-04-681108-g0004.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>The categorisation of the approaches discussed in the present survey.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Extrinsic</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Intrinsic</bold></th>
</tr>
<tr>
<th valign="top" align="left"><bold>Observational</bold></th>
<th valign="top" align="left"><bold>Interventional</bold></th>
<th valign="top" align="left"><bold>Observational</bold></th>
<th valign="top" align="left"><bold>Interventional</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Adams et al., <xref ref-type="bibr" rid="B1">2017</xref></td>
<td valign="top" align="left">Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B9">2020</xref></td>
<td valign="top" align="left">Gabrielsson and Carlsson, <xref ref-type="bibr" rid="B25">2019</xref></td>
<td valign="top" align="left">Chen et al., <xref ref-type="bibr" rid="B15">2019</xref></td>
</tr>
<tr>
<td valign="top" align="left">Bubenik, <xref ref-type="bibr" rid="B7">2015</xref></td>
<td valign="top" align="left">Kim et al., <xref ref-type="bibr" rid="B36">2020</xref></td>
<td valign="top" align="left">Khrulkov and Oseledets, <xref ref-type="bibr" rid="B35">2018</xref></td>
<td valign="top" align="left">Hofer et al., <xref ref-type="bibr" rid="B32">2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B11">2015</xref></td>
<td valign="top" align="left">Zhao and Wang, <xref ref-type="bibr" rid="B62">2019</xref></td>
<td valign="top" align="left">Zhou et al., <xref ref-type="bibr" rid="B64">2021</xref></td>
<td valign="top" align="left">Hofer C. et al., <xref ref-type="bibr" rid="B31">2019</xref></td>
</tr>
<tr>
<td valign="top" align="left">Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B10">2017</xref></td>
<td/>
<td/>
<td valign="top" align="left">Hofer et al., <xref ref-type="bibr" rid="B29">2020a</xref></td>
</tr>
<tr>
<td valign="top" align="left">Kusano et al., <xref ref-type="bibr" rid="B39">2018</xref></td>
<td/>
<td/>
<td valign="top" align="left">Hofer et al., <xref ref-type="bibr" rid="B30">2020b</xref></td>
</tr>
<tr>
<td valign="top" align="left">Reininghaus et al., <xref ref-type="bibr" rid="B47">2015</xref></td>
<td/>
<td/>
<td valign="top" align="left">Moor et al., <xref ref-type="bibr" rid="B43">2020</xref></td>
</tr>
<tr>
<td valign="top" align="left">Rieck et al., <xref ref-type="bibr" rid="B49">2020a</xref></td>
<td/>
<td/>
<td valign="top" align="left">Ramamurthy et al., <xref ref-type="bibr" rid="B46">2019</xref></td>
</tr>
<tr>
<td valign="top" align="left">Rieck et al., <xref ref-type="bibr" rid="B51">2020b</xref></td>
<td/>
<td/>
<td valign="top" align="left">Rieck et al., <xref ref-type="bibr" rid="B50">2019b</xref></td>
</tr>
<tr>
<td valign="top" align="left">Umeda, <xref ref-type="bibr" rid="B59">2017</xref></td>
<td/>
<td/>
<td valign="top" align="left">Zhao et al., <xref ref-type="bibr" rid="B63">2020</xref></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>It is interesting to note that intrinsic features tend to be used more in interventional settings, whereas extrinsic features remain observational for the most part</italic>.</p>
</table-wrap-foot>
</table-wrap>
<sec>
<title>3.1. Limitations</title>
<p>Our paper selection is a cross-section of work published at major machine learning conferences and in machine learning journals. We refrain from comparing methods on certain tasks&#x02014;such as classification&#x02014;because there is considerable heterogeneity in the experimental setup, precluding a <italic>fair</italic> assessment of such methods.</p>
</sec>
<sec>
<title>3.2. Extrinsic Topological Features in Machine Learning</title>
<p>This section gives an overview of methods that aim at suitably representing topological features in order to use them as input features for machine learning models. We will refer to this class of methods as <italic>extrinsic topological features in machine learning</italic>, as they take topological information of the data sets into account, as opposed to intrinsic topological information of the machine learning framework itself (see section 3.3). A large class of such methods comprises <italic>vectorisation</italic> methods, which aim to transform persistent homology information into feature vectors that can be used in machine learning models. However, alternative representations of topological descriptors, such as kernels or function-based representations, are also discussed in this section.</p>
<sec>
<title>3.2.1. Vector-Based and Function-Based Representations</title>
<p>Persistence diagrams (see section 2) constitute useful descriptors of homological information of data. However, being multisets, they cannot be used <italic>directly</italic> as input data for machine learning models in the usual sense (recent paradigm shifts in machine learning, namely the introduction of <italic>deep sets</italic> (Zaheer et al., <xref ref-type="bibr" rid="B61">2017</xref>), challenge this assumption somewhat, as we will later see in section 3.2.3). One first needs to suitably represent&#x02014;or <italic>vectorise</italic>&#x02014;persistence diagrams (PDs) in order to use them for downstream machine learning tasks. There are two predominant strategies for facilitating the integration of topological features into machine learning algorithms, namely (i) different representations that ideally give rise to feature vectors, and (ii) kernel-based methods that permit their integration into certain classifiers. Notice that these two strategies are not mutually exclusive; some representations, for example, also give rise to a kernel-based method.</p>
<p>Representations and kernel-based methods should ideally be efficiently computable, satisfy stability properties similar to those of the persistence diagrams themselves&#x02014;hence exhibiting robustness with respect to noise&#x02014;and provide interpretable features. The stability of such representations is based on the fundamental stability theorem by Cohen-Steiner et al. (<xref ref-type="bibr" rid="B17">2007</xref>). In recent years, a multitude of suitable representation methods have been introduced; we present a selection thereof, focusing on representations that have already been used in machine learning contexts. As a somewhat broad categorisation, we observe that persistence diagrams are often mapped into an auxiliary <italic>vector space</italic>, e.g., by discretisation (Anirudh et al., <xref ref-type="bibr" rid="B2">2016</xref>; Adams et al., <xref ref-type="bibr" rid="B1">2017</xref>), or by mapping into a (Banach- or Hilbert-) <italic>function space</italic> (Chazal et al., <xref ref-type="bibr" rid="B13">2014</xref>; Bubenik, <xref ref-type="bibr" rid="B7">2015</xref>; Di Fabio and Ferri, <xref ref-type="bibr" rid="B21">2015</xref>). Alternatively, there are several <italic>kernel methods</italic> (Reininghaus et al., <xref ref-type="bibr" rid="B47">2015</xref>; Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B10">2017</xref>; Kusano et al., <xref ref-type="bibr" rid="B39">2018</xref>) that enable the efficient calculation of a similarity measure between persistence diagrams. Representations and kernel-based methods fall into the category of what we denote &#x0201C;observational&#x0201D; methods. The only exception is given by PersLay (Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B9">2020</xref>), which informs the layers of the model and thus is an &#x0201C;interventional&#x0201D; method.</p>
<p>Arguably the simplest way of employing topological descriptors in machine learning tasks is to use <italic>summary statistics</italic>, such as the total persistence of a persistence diagram (Cohen-Steiner et al., <xref ref-type="bibr" rid="B19">2010</xref>), its <italic>p</italic>-norm (Chen and Edelsbrunner, <xref ref-type="bibr" rid="B14">2011</xref>), or its persistent entropy (Atienza et al., <xref ref-type="bibr" rid="B3">2019</xref>), i.e., the Shannon entropy of the individual persistence values in a diagram. While all of these approaches result in scalar-valued summary statistics, such statistics are often not expressive enough for complex machine learning tasks, which require richer representations. We note, however, that such statistics give rise to hypothesis testing (Blumberg et al., <xref ref-type="bibr" rid="B4">2014</xref>) based on topological information and we envision that this field will become more prominent as topological features find their use cases in data analysis. A simple and stable representation of persistence diagrams, suitable for machine learning tasks, is provided by what are commonly called <italic>Betti curves</italic>. Given a persistence diagram <inline-formula><mml:math id="M37"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:math></inline-formula>, and a weight function <italic>w</italic> : &#x0211D;<sup>2</sup> &#x02192; &#x0211D;, its Betti curve is the function &#x003B2; : &#x0211D; &#x02192; &#x0211D; defined by</p>
<disp-formula id="E9"><label>(1)</label><mml:math id="M38"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>&#x003B2;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow></mml:munder></mml:mstyle><mml:mi>w</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000B7;</mml:mo><mml:msub><mml:mrow><mml:mn>&#x1D7D9;</mml:mn></mml:mrow><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where</p>
<disp-formula id="E10"><label>(2)</label><mml:math id="M39"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mn>&#x1D7D9;</mml:mn></mml:mrow><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mtext class="textrm" mathvariant="normal">if&#x000A0;</mml:mtext><mml:mi>t</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mtext class="textrm" mathvariant="normal">else</mml:mtext></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>is the indicator function. Betti curves have often been used informally to analyse data (Umeda, <xref ref-type="bibr" rid="B59">2017</xref>); recently, Rieck et al. (<xref ref-type="bibr" rid="B49">2020a</xref>) provided a summarising description of their properties. <xref ref-type="fig" rid="F5">Figure 5</xref> depicts a simple illustration of the calculation of Betti curves. Betti curves are advantageous because they permit the calculation of a <italic>mean</italic> curve, in addition to providing an easy-to-evaluate distance and kernel method. Chevyrev et al. (<xref ref-type="bibr" rid="B16">2018</xref>) used this representation&#x02014;and related &#x0201C;paths&#x0201D; derived from a persistence diagram and its representations&#x02014;to solve classification tasks, using random forests and support vector machine classifiers. One drawback of Betti curves is their limited expressive power. Being a summary statistic, the Betti curve does not uniquely determine a persistence diagram, i.e., the mapping from a diagram to a curve is not injective; moreover, the curve only contains <italic>counts</italic> of topological features and does not permit tracking individual features, for example.</p>
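<p>Equation (1) is straightforward to evaluate directly. The following Python sketch (our illustration) uses the constant weight function <italic>w</italic> = 1, so that &#x003B2;(<italic>t</italic>) simply counts the number of features alive at scale <italic>t</italic>; sampling the curve on a fixed grid then yields a fixed-size feature vector for standard classifiers.</p>

```python
def betti_curve(diagram, t, w=lambda b, d: 1.0):
    """Evaluate the Betti curve of Equation (1) at scale t.

    `diagram` is a list of (birth, death) pairs; with the default weight
    w = 1, beta(t) counts the features whose interval contains t.
    """
    return sum(w(b, d) for b, d in diagram if b <= t <= d)

def betti_vector(diagram, grid):
    """Sample the Betti curve on a grid, yielding a fixed-size feature vector."""
    return [betti_curve(diagram, t) for t in grid]
```

<p>For the diagram {(0, 2), (1, 3)}, the curve evaluates to 1 at <italic>t</italic> = 0.5, to 2 at <italic>t</italic> = 1.5, and to 0 beyond <italic>t</italic> = 3, in line with the counting interpretation above.</p>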
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>A persistence diagram <bold>(A)</bold>, its persistence barcode <bold>(B)</bold>, and its corresponding Betti curve <bold>(C)</bold>. Notice that the <italic>interpretation</italic> of the axes of different plots is different, hence we exclude labels for the barcode representation.</p></caption>
<graphic xlink:href="frai-04-681108-g0005.tif"/>
</fig>
<p>A more fundamental technique, developed by Carri&#x000E8;re et al. (<xref ref-type="bibr" rid="B11">2015</xref>), <italic>directly</italic> generates a high-dimensional feature vector from a persistence diagram. The main idea is to obtain a vector representation of some persistence diagram <inline-formula><mml:math id="M40"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:math></inline-formula> based on the distribution of pairwise distances of its elements, including points on the diagonal &#x00394;: &#x0003D; {(<italic>x, x</italic>)|<italic>x</italic> &#x02208; &#x0211D;} &#x02282; &#x0211D;<sup>2</sup>. More precisely, for each pair (<italic>p, q</italic>) of points in <inline-formula><mml:math id="M41"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:math></inline-formula>, they compute <italic>m</italic>(<italic>p, q</italic>): &#x0003D; min{<italic>d</italic><sub>&#x0221E;</sub>(<italic>p, q</italic>), <italic>d</italic><sub>&#x0221E;</sub>(<italic>p</italic>, &#x00394;), <italic>d</italic><sub>&#x0221E;</sub>(<italic>q</italic>, &#x00394;)} and associate to <inline-formula><mml:math id="M42"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:math></inline-formula> the vector of these values, sorted in descending order. As persistence diagrams may be of different sizes, they enlarge each of these vectors by zeros so that its length matches the length of the longest vector in the set. Hence, the set of persistence diagrams one considers needs to be fixed a priori. This vectorisation does not necessarily scale well to large data sets, but it can provide a good baseline to furnish <italic>any</italic> machine learning classifier&#x02014;including a neural network&#x02014;with simple topology-based feature vectors. The use of this technique appears to be restricted at present; we hope that our article will help increase its adoption.</p>
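The construction can be sketched in a few lines, assuming persistence diagrams are given as lists of (birth, death) pairs and using the fact that the <italic>l</italic><sub>&#x0221E;</sub>-distance of a point (<italic>b, d</italic>) to the diagonal is (<italic>d</italic> &#x02212; <italic>b</italic>)/2; all names are ours, not from the cited work:

```python
import numpy as np
from itertools import combinations

def dist_to_diagonal(p):
    # l_inf distance of (b, d) to the diagonal {(x, x)} is (d - b) / 2.
    return abs(p[1] - p[0]) / 2.0

def sorted_distance_vector(diagram, length):
    """Vectorise a diagram via pairwise-distance statistics.

    For every pair (p, q), keep m(p, q) = min(d_inf(p, q),
    d_inf(p, diag), d_inf(q, diag)); sort descending and zero-pad
    to the fixed output `length`.
    """
    values = []
    for p, q in combinations(diagram, 2):
        d_pq = max(abs(p[0] - q[0]), abs(p[1] - q[1]))
        values.append(min(d_pq, dist_to_diagonal(p), dist_to_diagonal(q)))
    values.sort(reverse=True)
    vec = np.zeros(length)
    n = min(len(values), length)
    vec[:n] = values[:n]
    return vec

v = sorted_distance_vector([(0.0, 1.0), (0.0, 3.0), (2.0, 3.0)], length=5)
```

The zero-padding makes the a-priori choice of `length` explicit: it must be at least as large as the number of pairs in the largest diagram of the data set.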
<p>As a somewhat more complicated, but also more expressive, representation, Bubenik (<xref ref-type="bibr" rid="B7">2015</xref>) introduced topological descriptors called <italic>persistence landscapes</italic> that map persistence diagrams into a (Banach or Hilbert) function space in an invertible manner that satisfies stability properties with respect to the bottleneck distance of PDs. The <italic>persistence landscape</italic> &#x003BB; : &#x02115; &#x000D7; &#x0211D; &#x02192; &#x0211D; of a PD <inline-formula><mml:math id="M43"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> can be defined in the following way. For <italic>b</italic> &#x0003C; <italic>d</italic>, we consider the auxiliary function <italic>f</italic><sub>(<italic>b, d</italic>)</sub>(<italic>t</italic>): &#x0003D; max{0, min{<italic>t</italic> &#x02212; <italic>b, d</italic> &#x02212; <italic>t</italic>}} and define the persistence landscape as</p>
<disp-formula id="E11"><mml:math id="M44"><mml:mo>&#x003BB;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>kmax</mml:mtext><mml:msub><mml:mrow><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:math></disp-formula>
<p>where kmax denotes the <italic>k</italic>-th largest element of the set. In addition to injectivity and stability, persistence landscapes do not require any choice of auxiliary parameters in their construction (see <xref ref-type="fig" rid="F6">Figure 6</xref> for a depiction of the persistence landscape computation process). They also afford various summary statistics, such as a norm calculation as well as the calculation of both a kernel and a distance measure, making them a versatile representation of topological features. While persistence landscapes have seen applications in time series analysis (Stolz et al., <xref ref-type="bibr" rid="B56">2017</xref>), their most successful integration into machine learning algorithms comes in the form of a new <italic>layer</italic>: persistence landscapes form the basis of a topological layer for deep neural networks that is robust with respect to noise and differentiable with respect to its inputs, the so-called PLLay (persistence landscape based topological layer) established in Kim et al. (<xref ref-type="bibr" rid="B36">2020</xref>). This layer exhibits good performance in image classification tasks as well as orbit classification, where it is shown to provide new state-of-the-art performance. We note that persistence landscapes are often considered in a vectorised form, which is obtained through binning their domain. While this is possible and useful for certain applications, we want to stress that the persistence landscape, as a lossless representation, should ideally be treated as such. The calculation of persistence landscapes imposes additional computational complexity, but the empirical performance reported by Kim et al. (<xref ref-type="bibr" rid="B36">2020</xref>) suggests that landscapes are well-suited as a feature descriptor.</p>
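A minimal sketch of this definition (our own code, not PLLay) evaluates &#x003BB;(<italic>k</italic>, &#x000B7;) on a grid by stacking the tent functions <italic>f</italic><sub>(<italic>b<sub>i</sub></italic>, <italic>d<sub>i</sub></italic>)</sub> and selecting the <italic>k</italic>-th largest value pointwise:

```python
import numpy as np

def landscape(diagram, k, grid):
    """k-th persistence landscape of a diagram, evaluated on `grid`."""
    diagram = np.asarray(diagram, dtype=float)
    grid = np.asarray(grid, dtype=float)
    b = diagram[:, 0][:, None]
    d = diagram[:, 1][:, None]
    # Tent functions f_{(b, d)}(t) = max(0, min(t - b, d - t)), one row each.
    tents = np.maximum(0.0, np.minimum(grid - b, d - grid))
    if k > tents.shape[0]:
        return np.zeros_like(grid)
    # k-th largest value pointwise (k counted from 1).
    return -np.sort(-tents, axis=0)[k - 1]

grid = np.linspace(0.0, 3.0, 4)   # [0, 1, 2, 3]
lam1 = landscape([(0.0, 2.0), (1.0, 3.0)], k=1, grid=grid)
lam2 = landscape([(0.0, 2.0), (1.0, 3.0)], k=2, grid=grid)
```

Evaluating on a grid corresponds to the binned vectorisation discussed below; the landscape itself is a piecewise linear function and is only lossless when kept as such.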
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Computing a <italic>persistence landscape</italic> involves calculating the &#x0201C;area of influence&#x0201D; of each topological feature in a persistence diagram. Each connected shaded region with at least <italic>k</italic> intersections forms the basis of the <italic>k</italic>-th persistence landscape, which can be obtained by &#x0201C;peeling off&#x0201D; layers in an iterative fashion.</p></caption>
<graphic xlink:href="frai-04-681108-g0006.tif"/>
</fig>
<p>The <italic>persistence images</italic> (PIs), introduced by Adams et al. (<xref ref-type="bibr" rid="B1">2017</xref>), constitute an elegant hierarchical vectorisation, representing a PD as a vector through the following steps. First, the PD <inline-formula><mml:math id="M45"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:math></inline-formula> is transformed from &#x0201C;birth&#x02013;death&#x0201D;-coordinates into &#x0201C;birth&#x02013;persistence&#x0201D;-coordinates via the transformation</p>
<disp-formula id="E12"><mml:math id="M46"><mml:mi>T</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x02192;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x021A6;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>-</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:math></disp-formula>
<p>Next, for each <italic>u</italic> &#x02208; &#x0211D;<sup>2</sup> a differentiable probability density &#x003D5;<sub><italic>u</italic></sub> on &#x0211D;<sup>2</sup> is chosen (the standard choice being a normalised symmetric Gaussian with &#x1D53C;[&#x003D5;<sub><italic>u</italic></sub>] &#x0003D; <italic>u</italic>), as well as a weighting function <inline-formula><mml:math id="M47"><mml:mi>f</mml:mi><mml:mo>:</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x02192;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x02265;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> satisfying <italic>f</italic>|<sub>&#x0211D; &#x000D7; {0}</sub> &#x02261; 0. Additionally, one chooses a discretisation of a relevant subdomain of &#x0211D;<sup>2</sup> by a standard grid. Each region <italic>R</italic> of this grid then corresponds to a pixel in the persistence image with value given by</p>
<disp-formula id="E13"><mml:math id="M48"><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x0222B;</mml:mo></mml:mrow><mml:mrow><mml:mi>R</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>T</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:munder></mml:mstyle><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>d</mml:mtext><mml:mi>z</mml:mi><mml:mo>.</mml:mo></mml:math></disp-formula>
<p>In the process of generating persistence images, there are three non-canonical choices to be made: first, the weighting function, which is often chosen to emphasise features in the PD with large persistence; second, the distributions &#x003D5;<sub><italic>u</italic></sub>; and third, the resolution of the discretisation grid. Adams et al. (<xref ref-type="bibr" rid="B1">2017</xref>) prove that PIs are stable with respect to the 1-Wasserstein distance between persistence diagrams. <xref ref-type="fig" rid="F7">Figure 7</xref> illustrates their calculation. Persistence images are highly flexible and are often employed to make a classifier &#x0201C;topology-aware&#x0201D; to some extent (Zhao and Wang, <xref ref-type="bibr" rid="B62">2019</xref>; Carri&#x000E8;re and Blumberg, <xref ref-type="bibr" rid="B8">2020</xref>; Rieck et al., <xref ref-type="bibr" rid="B51">2020b</xref>). A paper by Zhao and Wang (<xref ref-type="bibr" rid="B62">2019</xref>), for instance, showcases their utility for graph classification. Interestingly, this paper also constitutes one of the few interventional approaches that employ extrinsic topological features; specifically, the authors use pre-defined filtrations to obtain graph-based persistence diagrams, and learn task-based weights for individual &#x0201C;pixels&#x0201D; (or &#x0201C;cells&#x0201D;) in the diagram. This approach surpasses several graph classification algorithms on standard benchmark data sets&#x02014;a remarkable feat, considering that the method does not employ any label information. The main drawbacks of persistence images are their quadratic storage and computation complexity, as well as the choice of appropriate parameters. While recent work found them to be remarkably stable in practice with respect to the Gaussian kernel parameters (Rieck et al., <xref ref-type="bibr" rid="B51">2020b</xref>), there are no guidelines for picking such hyperparameters, necessitating a (cross-validated) grid search, for instance.</p>
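A simplified sketch (ours, not the reference implementation of Adams et al.) makes the three choices explicit: here the weighting function is the persistence itself, &#x003D5;<sub><italic>u</italic></sub> is an isotropic Gaussian, and, as a common simplification, the density is evaluated at pixel centres rather than integrated over each pixel region:

```python
import numpy as np

def persistence_image(diagram, resolution=(20, 20), sigma=0.1,
                      extent=(0.0, 1.0, 0.0, 1.0)):
    """Sketch of a persistence image for a diagram of (birth, death) pairs."""
    xs = np.linspace(extent[0], extent[1], resolution[0])
    ys = np.linspace(extent[2], extent[3], resolution[1])
    gx, gy = np.meshgrid(xs, ys)
    image = np.zeros_like(gx)
    for b, d in diagram:
        x, y = b, d - b          # T: birth-death -> birth-persistence
        weight = y               # vanishes for zero-persistence points
        image += weight * np.exp(
            -((gx - x) ** 2 + (gy - y) ** 2) / (2.0 * sigma ** 2))
    return image

img = persistence_image([(0.2, 0.8), (0.5, 0.6)])
```

The parameters `sigma`, `resolution`, and the weighting are exactly the non-canonical choices discussed above; in practice they are tuned, e.g., by cross-validation.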
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>A persistence image arises as a discretisation of the density function (with appropriate weights) supported on a persistence diagram. It permits the calculation of an increasingly better-resolved sequence of images, which may be directly used as feature vectors.</p></caption>
<graphic xlink:href="frai-04-681108-g0007.tif"/>
</fig>
</sec>
<sec>
<title>3.2.2. Kernel-Based Representations</title>
<p>As an alternative to the previously-discussed representations, we now want to briefly focus on persistence diagrams again. The space of persistence diagrams can be endowed with metrics, such as the bottleneck distance. However, there is no natural Hilbert space structure on it, and such metrics tend to be computationally prohibitive or require the use of complex approximation algorithms (Kerber et al., <xref ref-type="bibr" rid="B34">2017</xref>). Kernel methods provide a way of implicitly introducing such a Hilbert space structure to which persistence diagrams can be mapped via the feature map of the kernel. This then allows for a downstream use in machine learning models. To be more specific, given a set <italic>X</italic>, a function <italic>k</italic> : <italic>X</italic> &#x000D7; <italic>X</italic> &#x02192; &#x0211D; is called a (positive definite) <italic>kernel</italic> if there exists a Hilbert space <inline-formula><mml:math id="M49"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> together with a <italic>feature map</italic> <inline-formula><mml:math id="M50"><mml:mi>&#x003D5;</mml:mi><mml:mo>:</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x02192;</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> such that <inline-formula><mml:math id="M51"><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mo>&#x02329;</mml:mo><mml:mrow><mml:mi>&#x003D5;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x003D5;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x0232A;</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:math></inline-formula> for all <italic>x</italic><sub>1</sub>, <italic>x</italic><sub>2</sub> &#x02208; <italic>X</italic>. Thus, by defining a kernel on the set of persistence diagrams, one obtains a vector representation via the feature map. However, in order for such a kernel to be useful in practice, it should additionally preserve the metric stability properties of persistence diagrams. Some pertinent examples of the kernel method are the following. Reininghaus et al. (<xref ref-type="bibr" rid="B47">2015</xref>) define a kernel on the set of persistence diagrams that is stable with respect to the 1-Wasserstein distance (Villani, <xref ref-type="bibr" rid="B60">2009</xref>). The kernel is based on the idea of heat diffusion on a persistence diagram and offers a feature map that can be discretised (in fact, there are interesting similarities to persistence images). It was subsequently shown to satisfy <italic>universality</italic> (Kwitt et al., <xref ref-type="bibr" rid="B40">2015</xref>), a desirable property for a kernel to have because it implies suitability for hypothesis testing. 
The <italic>sliced Wasserstein kernel</italic>, which is metric-preserving, was introduced by Carri&#x000E8;re et al. (<xref ref-type="bibr" rid="B10">2017</xref>). It is based on the idea of the sliced Wasserstein distance (Kolouri et al., <xref ref-type="bibr" rid="B37">2016</xref>), which ensures positive definiteness of the kernel through low-dimensional projections. Kusano et al. (<xref ref-type="bibr" rid="B39">2018</xref>) propose <italic>persistence weighted Gaussian kernels</italic> that incorporate a weighting and satisfy stability results with respect to the bottleneck distance and the 1-Wasserstein distance. The expressive power of kernels comes at the cost of computational complexity. Na&#x000EF;ve implementations scale quadratically in the number of points, thus impeding the use of kernels for persistence diagrams with a large number of points. Some mitigation strategies exist (Greengard and Strain, <xref ref-type="bibr" rid="B27">1991</xref>; Rahimi and Recht, <xref ref-type="bibr" rid="B45">2008</xref>), but have not been adopted by implementations so far (moreover, they are not always applicable, necessitating additional research). Nevertheless, such kernels are attractive because they are <italic>not</italic> limited with respect to the input data. Most of these papers report good performance on shape classification or segmentation tasks, as well as on orbit classification.</p>
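To make the construction concrete, the following sketch approximates the sliced Wasserstein distance with a finite number of projection directions and plugs it into a Gaussian-type kernel. This is a simplified illustration of the idea of Carri&#x000E8;re et al. (2017), not their implementation; the bandwidth is a placeholder parameter:

```python
import numpy as np

def sliced_wasserstein(D1, D2, n_directions=50):
    """Approximate sliced Wasserstein distance between two diagrams."""
    D1, D2 = np.asarray(D1, float), np.asarray(D2, float)

    def diag_proj(D):
        # Orthogonal projection of each point onto the diagonal.
        m = D.sum(axis=1) / 2.0
        return np.column_stack([m, m])

    # Augment each diagram with the diagonal projections of the other,
    # so that both point sets have equal cardinality.
    A = np.vstack([D1, diag_proj(D2)])
    B = np.vstack([D2, diag_proj(D1)])
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_directions, endpoint=False)
    total = 0.0
    for t in thetas:
        direction = np.array([np.cos(t), np.sin(t)])
        # 1-D Wasserstein distance between sorted projections.
        total += np.abs(np.sort(A @ direction) - np.sort(B @ direction)).sum()
    return total / n_directions

def sw_kernel(D1, D2, bandwidth=1.0):
    return np.exp(-sliced_wasserstein(D1, D2) / (2.0 * bandwidth ** 2))
```

The quadratic cost mentioned above appears here as the sorting of all projected points for every direction and every pair of diagrams in a Gram matrix.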
<p>While most of the aforementioned kernels are used to directly compare persistence diagrams, there are also examples of kernels <italic>based on</italic> topological information. An interesting example is provided by Rieck et al. (<xref ref-type="bibr" rid="B48">2019a</xref>), who introduce the Persistent Weisfeiler&#x02013;Lehman (P-WL) kernel for graphs. It computes topological features during a Weisfeiler&#x02013;Lehman (WL) procedure. The WL procedure refers to an iterative scheme in which vertex label information is aggregated over the neighbours of each vertex, resulting in a label multiset. A perfect hashing scheme is then applied to every multiset, and the graph is relabelled with the ensuing hashes. This process can be repeated until a pre-defined limit has been reached or until the labels do not change any more. While the WL procedure was originally intended as a test for graph isomorphism, there are non-isomorphic graphs that it cannot distinguish. Nevertheless, it is an exceptionally useful way of assessing the dissimilarity between two graphs in polynomial time, leading to the WL kernel framework (Shervashidze and Borgwardt, <xref ref-type="bibr" rid="B53">2009</xref>; Shervashidze et al., <xref ref-type="bibr" rid="B54">2011</xref>), which enjoys great popularity for graph learning tasks (Borgwardt et al., <xref ref-type="bibr" rid="B5">2020</xref>; Kriege et al., <xref ref-type="bibr" rid="B38">2020</xref>). The P-WL extension of WL is characterised by its capability to extract topological information of the graph with respect to the current node labelling at each WL iteration. This kernel is particularly notable since it constitutes the first (to our knowledge) method that incorporates data-based labels into the calculation of persistent homology.</p>
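A single WL relabelling step, as described above, can be sketched as follows (the graph representation and names are ours); the &#x0201C;perfect hashing&#x0201D; is realised by assigning a fresh integer id to each distinct label signature:

```python
def wl_iteration(adjacency, labels):
    """One Weisfeiler-Lehman relabelling step.

    adjacency: dict mapping each vertex to an iterable of neighbours
    labels:    dict mapping each vertex to its current label
    """
    # Aggregate each vertex's own label with the sorted multiset of its
    # neighbours' labels.
    signatures = {
        v: (labels[v], tuple(sorted(labels[u] for u in adjacency[v])))
        for v in adjacency
    }
    # "Perfect hashing": assign a fresh integer id per distinct signature.
    ids = {}
    return {v: ids.setdefault(sig, len(ids)) for v, sig in signatures.items()}

# Path graph 0-1-2 with identical initial labels: after one iteration,
# the middle vertex receives a different label from the two endpoints.
relabelled = wl_iteration({0: [1], 1: [0, 2], 2: [1]}, {0: 0, 1: 0, 2: 0})
```

Iterating this step until the labels stabilise yields the label histograms underlying the WL kernel; P-WL additionally computes persistent homology with respect to the current labelling at each iteration.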
</sec>
<sec>
<title>3.2.3. Integrating Topological Descriptors Into Neural Networks</title>
<p>One of the seminal methods that built a bridge between modern machine learning techniques and TDA is a work by Hofer et al. (<xref ref-type="bibr" rid="B32">2017</xref>). Using a differentiable projection function for persistence diagrams (with learnable parameters), the authors demonstrate that persistence diagrams of a data set can be easily integrated into <italic>any</italic> deep learning architecture. While the primary focus of the paper lies on developing such a projection function, the authors demonstrate the general feasibility of topological descriptors in both shape and graph classification tasks. A follow-up publication (Hofer C. D. et al., <xref ref-type="bibr" rid="B33">2019</xref>) discusses more theoretical requirements for learning representations of topological descriptors.</p>
<p>This approach, as well as the development of the &#x0201C;DeepSets&#x0201D; architecture (Zaheer et al., <xref ref-type="bibr" rid="B61">2017</xref>), which makes deep learning methods capable of learning <italic>sets</italic>, i.e., unordered collections of varying cardinality, spurred the development of <italic>layers</italic> that can be easily integrated into a deep learning workflow. An excellent example of such a layer is PersLay (Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B9">2020</xref>), which employs extended persistence (Cohen-Steiner et al., <xref ref-type="bibr" rid="B18">2009</xref>) and heat kernel signatures to <italic>learn</italic> a vectorisation of persistence diagrams suited to the learning task at hand. PersLay is defined by</p>
<disp-formula id="E14"><mml:math id="M52"><mml:mtext>PersLay</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>op</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mi>w</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000B7;</mml:mo><mml:mi>&#x003D5;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M53"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">D</mml:mi></mml:mrow></mml:math></inline-formula> is a persistence diagram, op is a permutation-invariant mapping, <italic>w</italic> : &#x0211D;<sup>2</sup> &#x02192; &#x0211D; is a weight function, and &#x003D5; : &#x0211D;<sup>2</sup> &#x02192; &#x0211D;<sup><italic>d</italic></sup> is a vector representation function. Its generic definition allows PersLay to subsume and recover many existing representations by appropriate choices of op and &#x003D5; (Carri&#x000E8;re et al., <xref ref-type="bibr" rid="B9">2020</xref>).</p>
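The generic form of the layer can be sketched directly from the formula. The weight function and the Gaussian point transformation below are hypothetical fixed choices made for illustration; in the actual layer, such parameters are learned:

```python
import numpy as np

def perslay(diagram, weight_fn, phi_fn, op=np.sum):
    """Evaluate op({w(p) * phi(p)}) over the points p of a diagram."""
    transformed = np.array([weight_fn(p) * phi_fn(p) for p in diagram])
    # A permutation-invariant op (sum, max, mean, ...) collapses the set.
    return op(transformed, axis=0)

# Hypothetical fixed choices: persistence-based weight and a Gaussian
# point transformation with two (untrained) centres.
centres = np.array([[0.2, 0.4], [0.5, 0.9]])
w = lambda p: p[1] - p[0]
phi = lambda p: np.exp(-np.sum((centres - np.asarray(p)) ** 2, axis=1))

vec = perslay([(0.1, 0.5), (0.3, 0.9)], w, phi)
```

Swapping `op` for `np.max` or replacing `phi` with tent functions recovers, respectively, max-pooled representations or landscape-like vectorisations, which illustrates the subsumption property mentioned above.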
</sec>
</sec>
<sec>
<title>3.3. Intrinsic Topological Features in Machine Learning</title>
<p>This section reviews methods that either incorporate topological information directly into the design of a machine learning model itself, or leverage topology to study aspects of such a model. We refer to such features as <italic>intrinsic topological features</italic>. The primary examples are regularisation techniques as well as techniques for analysing neural network architectures.</p>
<sec>
<title>3.3.1. Regularisation Techniques</title>
<p>As a recent example, Moor et al. (<xref ref-type="bibr" rid="B43">2020</xref>) propose a topological autoencoder, which aims to preserve topological features of the input data in low-dimensional representations. This is achieved via a regularisation term that incentivises the persistence diagrams of both the latent and input space to be topologically similar. This method acts on the level of mini-batches, treating each of them as a point cloud. Persistence diagrams are obtained from the Vietoris&#x02013;Rips complex of each space. By tracking the simplices that are relevant for the creation and destruction of topological features, and by consistently mapping simplices to a given edge in the Vietoris&#x02013;Rips complex, each filtration can be interpreted as a selection of distances from the full distance matrix of the point cloud. The proposed regularisation term then compares the &#x0201C;selected&#x0201D; distances in the data space with the corresponding distances in the latent space (and vice versa). Finally, this regularisation is differentiable under the assumption that the persistence diagram is discrete (i.e., for each of its points, there is an infinitesimal neighbourhood containing no other points). The scheme can thus be directly integrated into the end-to-end training of an autoencoder, making it aware of the topology in the data space. This work can also be considered as an extension of previous work by Hofer C. et al. (<xref ref-type="bibr" rid="B31">2019</xref>), who introduced a differentiable loss term for one-class learning that controls the topology of the latent space; in effect, their loss term enforces a preferred &#x0201C;scale&#x0201D; for topological features in the latent space. It does not have to harmonise topological features <italic>across</italic> different spaces. It turns out that an autoencoder trained with this loss term on unlabelled data can be used on other data sets for one-class learning. 
This hints at the fact that enforcing a certain topological structure can be beneficial for learning tasks; we will later see that such empirical observations can also be furnished with a theoretical underpinning.</p>
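The distance-selection mechanism described above can be sketched for the 0-dimensional case, where the persistence pairing of a Vietoris&#x02013;Rips filtration corresponds exactly to the edges of a minimum spanning tree. This sketch (ours) only computes the scalar value of the regulariser, whereas the actual method differentiates through it:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def persistence_edges(X):
    """Edges of the 0-dimensional Vietoris-Rips persistence pairing of a
    point cloud: exactly the edges of a minimum spanning tree."""
    mst = minimum_spanning_tree(cdist(X, X)).tocoo()
    return np.stack([mst.row, mst.col], axis=1)

def topo_loss(X, Z):
    """Compare the distances selected by the pairing of one space with
    the corresponding distances in the other space (and vice versa)."""
    dX, dZ = cdist(X, X), cdist(Z, Z)
    eX, eZ = persistence_edges(X), persistence_edges(Z)
    loss_x = ((dX[eX[:, 0], eX[:, 1]] - dZ[eX[:, 0], eX[:, 1]]) ** 2).sum()
    loss_z = ((dZ[eZ[:, 0], eZ[:, 1]] - dX[eZ[:, 0], eZ[:, 1]]) ** 2).sum()
    return 0.5 * (loss_x + loss_z)
```

If the latent space reproduces all selected distances exactly, the loss vanishes; any distortion of a topologically relevant distance is penalised quadratically.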
<p>An approach by Chen et al. (<xref ref-type="bibr" rid="B15">2019</xref>) takes a different perspective. The authors develop a measure of the <italic>topological complexity</italic> (in terms of connected components) of the classification boundary of a given classifier. Said topological information is then used for regularisation in order to force the decision boundary to be topologically simpler, i.e., to contain fewer features of low persistence. Thus, topological information serves as a penalty during classification such that training the classifier itself can be improved. In contrast to the aforementioned approach, differentiability is obtained through a &#x0201C;surrogate&#x0201D; piecewise linear approximation of the classifier. The method yields competitive results, and the authors observe that it performs well even in the presence of label noise. Analysing the decision boundary of a classifier also turns out to be advantageous for <italic>model selection</italic>, as we will later see in section 3.3.2.</p>
<p>Hofer et al. (<xref ref-type="bibr" rid="B29">2020a</xref>) analyse more fundamental principles of regularisation by means of topological features. Specifically, they study regularisation in a regime of small sample sizes with over-parametrised neural networks. Developing a new topological constraint for per-class probability measures, they observe mass concentration effects in the vicinity of the learned representations of training instances, leading to overall improvements in generalisation performance. The authors note that controlling topological properties of learned representations presents numerous avenues for future research. These theoretical findings validate the empirical improvements observed in previous works in this domain.</p>
<p>As a more involved example of methods that make use of intrinsic features, Zhao et al. (<xref ref-type="bibr" rid="B63">2020</xref>) include topological features of graph neighbourhoods into a standard graph neural network (GNN) architecture. Their method combines a shortest-path filtration with persistence images, which are subsequently compressed to a single scalar value using a multilayer perceptron. The resulting scalar is then used to re-weight the message passing scheme used in training the GNN, thus obtaining topologically-based representations of graph neighbourhoods. In contrast to the previously-described loss terms, this method is not end-to-end differentiable, though, because the conversion from persistence diagrams to persistence images involves non-continuous parameters, i.e., the image dimensions. Zhao et al. (<xref ref-type="bibr" rid="B63">2020</xref>) primarily propose this method for node classification tasks, but we hypothesise that other graph tasks would profit from the integration of topological features.</p>
<p>Last, to provide a somewhat complementary perspective to preceding work, a paper by Hofer et al. (<xref ref-type="bibr" rid="B30">2020b</xref>) discusses how to employ graph neural networks (GNNs) to <italic>learn</italic> an appropriate filtration in an end-to-end fashion. The authors demonstrate that a GNN can be used to successfully initialise a scalar-valued filtration function, which can then subsequently be trained under mild assumptions (specifically, injectivity at the vertices of the graph needs to hold). The learned filtration turns out to surpass fixed filtrations combined with a persistent homology baseline, thus demonstrating the benefits of making topological representations differentiable&#x02014;and thus <italic>trainable</italic>.</p>
</sec>
<sec>
<title>3.3.2. Model Analysis</title>
<p>Shifting our view from regularisation techniques, we note that topological analysis has also been applied to evaluate generative adversarial networks (GANs). A GAN (Goodfellow et al., <xref ref-type="bibr" rid="B26">2014</xref>) consists of two sub-networks, a generator and a discriminator. Given a data distribution <italic>P</italic><sub>data</sub>, the generator's objective is to learn a distribution <italic>P</italic><sub>model</sub> with the same statistics, whereas the discriminator learns to distinguish generated samples from actual data samples. The topological evaluation of GANs is motivated by the manifold hypothesis (Fefferman et al., <xref ref-type="bibr" rid="B24">2013</xref>), which posits that a data distribution <italic>P</italic><sub>data</sub> is sampled from an underlying manifold <inline-formula><mml:math id="M54"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">data</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:math></inline-formula>. The idea is to assess the topological similarity of <inline-formula><mml:math id="M55"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">data</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:math></inline-formula> and the underlying manifold <inline-formula><mml:math id="M56"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">model</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:math></inline-formula> of the model-generated distribution <italic>P</italic><sub>model</sub>. 
Based on the persistent homology of witness complexes, Khrulkov and Oseledets (<xref ref-type="bibr" rid="B35">2018</xref>) introduce the <italic>Geometry Score</italic>, which is a similarity measure of the topologies of <inline-formula><mml:math id="M57"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">data</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M58"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">model</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:math></inline-formula> and can be used to evaluate generative models. Later work by Zhou et al. (<xref ref-type="bibr" rid="B64">2021</xref>) generalises this approach and additionally extends it to the disentanglement evaluation of generative models in unsupervised settings.</p>
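<p>As an illustration of the underlying idea, rather than of the Geometry Score itself (which relies on witness complexes and 1-dimensional relative living times), the following sketch compares two samples via their 0-dimensional Vietoris&#x02013;Rips barcodes: the death times of the connected components coincide with the edge lengths of a minimum spanning tree. The discrepancy function is a hypothetical summary of our own.</p>

```python
import math

def h0_death_times(points):
    """H0 death times of a Vietoris-Rips filtration = MST edge lengths."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # distance to the current tree (Prim's algorithm)
    best[0] = 0.0
    deaths = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if best[u] > 0.0:
            deaths.append(best[u])     # the MST edge attaching u kills a component
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return sorted(deaths)

def topological_discrepancy(sample_a, sample_b):
    """Hypothetical score: L1 distance between sorted death-time profiles."""
    da, db = h0_death_times(sample_a), h0_death_times(sample_b)
    m = min(len(da), len(db))
    return sum(abs(x - y) for x, y in zip(da[:m], db[:m])) / m
```

<p>Two samples from the same manifold would be expected to yield a discrepancy close to zero, whereas, e.g., mode collapse in a generative model would change the death-time profile noticeably.</p>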
<p>In a different direction, the topological analysis of the intrinsic structure of a classifier, such as a neural network, makes it possible to improve a variety of tasks. This includes the analysis of training behaviour as well as model selection&#x02014;or <italic>architecture selection</italic> in the case of neural networks.</p>
<p>While the literature dedicated to a better understanding of deep neural networks has typically focused on their functional properties, Rieck et al. (<xref ref-type="bibr" rid="B50">2019b</xref>) take a different perspective and focus on the graph structure of a neural network. Specifically, they treat a (feed-forward) neural network as a stack of bipartite graphs. From this view, they propose &#x0201C;neural persistence,&#x0201D; a complexity measure that summarises the topological features arising from a filtration of the neural network graph in which the filtration weights are given by the network parameters. They show that neural persistence can distinguish between well-trained and badly-trained (i.e., diverged) networks. This measure is oblivious to the functional behaviour of the underlying network and focuses solely on its (weighted) <italic>structure</italic>. Nevertheless, Rieck et al. (<xref ref-type="bibr" rid="B50">2019b</xref>) show that it can guide early stopping based solely on topological properties of the neural network, potentially saving the validation data otherwise used for the early stopping decision.</p>
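<p>Under simplifying assumptions, neural persistence for a single layer can be sketched as follows: the absolute weights are normalised by the largest one, edges of the bipartite graph enter the filtration in decreasing order of weight, and every merge of two connected components contributes a persistence pair (1, <italic>w</italic>); the score is the <italic>p</italic>-norm of the resulting persistence values. This is a toy reading of the construction, not the reference implementation of Rieck et al. (2019b).</p>

```python
def neural_persistence(weight_matrix, p=2):
    """Toy neural persistence of one fully connected layer (list of rows)."""
    rows, cols = len(weight_matrix), len(weight_matrix[0])
    w_max = max(abs(w) for row in weight_matrix for w in row)
    # Bipartite edges (input unit i, output unit rows + j), normalised to [0, 1].
    edges = sorted(
        ((abs(weight_matrix[i][j]) / w_max, i, rows + j)
         for i in range(rows) for j in range(cols)),
        reverse=True,
    )
    parent = list(range(rows + cols))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    persistences = []
    for w, u, v in edges:                  # descending-weight filtration
        ru, rv = find(u), find(v)
        if ru != rv:                       # merge: a component dies at w
            parent[rv] = ru
            persistences.append(1.0 - w)   # pair (1, w) has persistence 1 - w
    return sum(q ** p for q in persistences) ** (1.0 / p)
```

<p>A layer whose weights are all equal has a score of zero under this sketch; more heterogeneous weight distributions increase it, in line with the observation that the measure distinguishes well-trained from diverged networks.</p>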
<p>Ramamurthy et al. (<xref ref-type="bibr" rid="B46">2019</xref>) employ labelled variants of simplicial complexes, such as a labelled Vietoris&#x02013;Rips complex, to analyse the decision boundary (i.e., classification boundary) of a given classifier. The authors provide theoretical guarantees that the correct homology of a decision boundary can be recovered from samples, thus paving the way for an efficient approximation scheme that incorporates local scale estimates of the data set. Such a construction is required because the density of available samples is not guaranteed to be uniform, leading to simplicial complexes with spurious simplices in high-density regions while running the risk of &#x0201C;undersampling&#x0201D; low-density regions. In addition to &#x0201C;matching&#x0201D; models based on the <italic>Decision Boundary Topological Complexity</italic> (DBTC) score, Ramamurthy et al. (<xref ref-type="bibr" rid="B46">2019</xref>) also enable matching data sets to pre-trained models. The underlying assumption is that a model that closely mimics the topological complexity of a data set is presumably a better candidate for this particular data set.</p>
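<p>A toy version of the labelled-complex idea can be sketched as follows: in a labelled Vietoris&#x02013;Rips complex, edges connecting points with different labels straddle the decision boundary, and their midpoints yield a crude sample of it. The scale parameter <monospace>eps</monospace> plays the role of the local scale estimates mentioned above, but is a single global, hypothetical choice here.</p>

```python
import math

def boundary_midpoints(points, labels, eps):
    """Midpoints of differently-labelled edges at scale eps (toy sketch)."""
    mids = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # An edge exists in the Vietoris-Rips complex at scale eps;
            # differing labels mean it crosses the decision boundary.
            if labels[i] != labels[j] and math.dist(points[i], points[j]) <= eps:
                mids.append(tuple((a + b) / 2
                                  for a, b in zip(points[i], points[j])))
    return mids
```

<p>Homology computed on such boundary samples then characterises the topological complexity of the classifier's decision regions.</p>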
<p>Gabrielsson and Carlsson (<xref ref-type="bibr" rid="B25">2019</xref>) utilise topological data analysis to study the information encoded in the weights of convolutional neural networks (CNNs). They show that the weights of convolutional layers encode simple global structures that change dynamically during training and correlate with the network&#x00027;s ability to generalise to unseen data. Moreover, they find that topological information on the trained weights of a network can improve training efficiency and reflects the generality of the data set on which the training was performed.</p>
</sec>
</sec>
</sec>
<sec id="s4">
<title>4. Outlook and Challenges</title>
<p>This survey provided a glimpse of the nascent field of <italic>topological machine learning</italic>. We categorised existing work according to its intention (interventional vs. observational) and to the type of topological features being calculated (extrinsic vs. intrinsic), finding that most extrinsic approaches are observational, i.e., they do not inform the subsequent choice of model, while most intrinsic approaches are interventional, i.e., they result in changes to the choice of model or its architecture.</p>
<p>Numerous avenues for future research exist. Of the utmost importance is the improvement of the &#x0201C;software ecosystem.&#x0201D; Software libraries such as <monospace>GUDHI</monospace> (Maria et al., <xref ref-type="bibr" rid="B42">2014</xref>) and <monospace>giotto-tda</monospace> (Tauzin et al., <xref ref-type="bibr" rid="B58">2021</xref>) are vital ingredients for increasing the adoption of TDA methods, but we envision a specific niche for libraries that integrate <italic>directly</italic> with machine learning frameworks such as <monospace>pytorch</monospace>. This will make it easier to disseminate knowledge and inspire more research. A challenge the community has yet to overcome, though, is the overall scalability of these methods. While certain improvements on the level of filtrations are being made (Sheehy, <xref ref-type="bibr" rid="B52">2013</xref>; Cavanna et al., <xref ref-type="bibr" rid="B12">2015</xref>), they have yet to be integrated into existing algorithms. A more fundamental question is to what extent TDA has to rely on &#x0201C;isotropic&#x0201D; complexes such as the Vietoris&#x02013;Rips complex, and whether scale-dependent complexes that incorporate sparsity can be developed.</p>
<p>On the side of applications, we note that several papers already target problems such as graph classification, but they are primarily based on fixed filtrations (with the notable exception of Hofer et al. (<xref ref-type="bibr" rid="B30">2020b</xref>), who learn a filtration end-to-end). We envision that future work could target more involved scenarios, such as the creation of &#x0201C;hybrid&#x0201D; GNNs, and the use of end-to-end differentiable features for other graph tasks, such as node classification, link prediction, or community detection.</p>
<p>As another upcoming topic, we think that the analysis of time-varying data sets using topology-based methods is long overdue. Although initial work by Cohen-Steiner et al. (<xref ref-type="bibr" rid="B20">2006</xref>) on time-varying topological descriptors provides a theoretical foundation, few topology-based approaches address time series classification or time series analysis. Several theoretical and practical aspects of such an endeavour are addressed by Perea et al. (<xref ref-type="bibr" rid="B44">2015</xref>), who develop a persistence-based method for quantifying periodicity in time series. The method combines the fundamental embedding theorem of Takens (<xref ref-type="bibr" rid="B57">1981</xref>) with a sliding-window approach. Future work could build on such approaches, or find other ways to characterise time series, for instance based on complex networks (Lacasa et al., <xref ref-type="bibr" rid="B41">2008</xref>). This could pave the road toward novel applications of TDA such as anomaly detection.</p>
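<p>The sliding-window construction underlying the method of Perea et al. (2015) is straightforward to state: a signal <italic>f</italic> is mapped to the point cloud of delay vectors (<italic>f</italic>(<italic>t</italic>), <italic>f</italic>(<italic>t</italic> + &#x003C4;), &#x02026;, <italic>f</italic>(<italic>t</italic> + <italic>d</italic>&#x003C4;)). A periodic signal then traces out a closed loop, which persistent homology detects as a prominent 1-dimensional feature. The embedding dimension and delay below are illustrative choices, not tuned values.</p>

```python
import math

def sliding_window_embedding(signal, d, tau):
    """Map index t to the delay vector (f(t), f(t + tau), ..., f(t + d*tau))."""
    last = len(signal) - d * tau
    return [tuple(signal[t + k * tau] for k in range(d + 1))
            for t in range(last)]

# A periodic toy signal with period 20 samples; with tau equal to a quarter
# period, consecutive coordinates are roughly 90 degrees out of phase, so
# the embedded points lie on a closed curve.
signal = [math.sin(2 * math.pi * t / 20) for t in range(100)]
cloud = sliding_window_embedding(signal, d=2, tau=5)
```

<p>Feeding such a point cloud into a Vietoris&#x02013;Rips persistence computation would reveal a long-lived 1-dimensional class for periodic signals and none for aperiodic ones, which is the basis of the periodicity score.</p>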
</sec>
<sec id="s5">
<title>Author Contributions</title>
<p>FH, MM, and BR performed the literature search and revised the draft. FH and BR drafted the original manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Adams</surname> <given-names>H.</given-names></name> <name><surname>Emerson</surname> <given-names>T.</given-names></name> <name><surname>Kirby</surname> <given-names>M.</given-names></name> <name><surname>Neville</surname> <given-names>R.</given-names></name> <name><surname>Peterson</surname> <given-names>C.</given-names></name> <name><surname>Shipman</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Persistence images: a stable vector representation of persistent homology</article-title>. <source>J. Mach. Learn. Res</source>. <volume>18</volume>, <fpage>1</fpage>&#x02013;<lpage>35</lpage>.</citation></ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Anirudh</surname> <given-names>R.</given-names></name> <name><surname>Venkataraman</surname> <given-names>V.</given-names></name> <name><surname>Ramamurthy</surname> <given-names>K. N.</given-names></name> <name><surname>Turaga</surname> <given-names>P.</given-names></name></person-group> (<year>2016</year>). <article-title>A Riemannian framework for statistical analysis of topological persistence diagrams</article-title>, in <source>2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</source>, <fpage>1023</fpage>&#x02013;<lpage>1031</lpage>. <pub-id pub-id-type="doi">10.1109/CVPRW.2016.132</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Atienza</surname> <given-names>N.</given-names></name> <name><surname>Gonzalez-Diaz</surname> <given-names>R.</given-names></name> <name><surname>Rucco</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>Persistent entropy for separating topological features from noise in Vietoris&#x02013;Rips complexes</article-title>. <source>J. Intell. Inform. Syst</source>. <volume>52</volume>, <fpage>637</fpage>&#x02013;<lpage>655</lpage>. <pub-id pub-id-type="doi">10.1007/s10844-017-0473-4</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blumberg</surname> <given-names>A. J.</given-names></name> <name><surname>Gal</surname> <given-names>I.</given-names></name> <name><surname>Mandell</surname> <given-names>M. A.</given-names></name> <name><surname>Pancia</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Robust statistics, hypothesis testing, and confidence intervals for persistent homology on metric measure spaces</article-title>. <source>Found. Comput. Math</source>. <volume>14</volume>, <fpage>745</fpage>&#x02013;<lpage>789</lpage>. <pub-id pub-id-type="doi">10.1007/s10208-014-9201-4</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Borgwardt</surname> <given-names>K.</given-names></name> <name><surname>Ghisu</surname> <given-names>E.</given-names></name> <name><surname>Llinares-Lopez</surname> <given-names>F.</given-names></name> <name><surname>O&#x00027;Bray</surname> <given-names>L.</given-names></name> <name><surname>Rieck</surname> <given-names>B.</given-names></name></person-group> (<year>2020</year>). <article-title>Graph kernels: state-of-the-art and future challenges</article-title>. <source>Found. Trends Mach. Learn</source>. <volume>13</volume>, <fpage>531</fpage>&#x02013;<lpage>712</lpage>. <pub-id pub-id-type="doi">10.1561/2200000076</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bredon</surname> <given-names>G. E.</given-names></name></person-group> (<year>1993</year>). <source>Topology and Geometry, Volume 139 of Graduate Texts in Mathematics. New York, NY: Springer-Verlag</source>. <pub-id pub-id-type="doi">10.1007/978-1-4757-6848-0</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bubenik</surname> <given-names>P.</given-names></name></person-group> (<year>2015</year>). <article-title>Statistical topological data analysis using persistence landscapes</article-title>. <source>J. Mach. Learn. Res</source>. <volume>16</volume>, <fpage>77</fpage>&#x02013;<lpage>102</lpage>.</citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carri&#x000E8;re</surname> <given-names>M.</given-names></name> <name><surname>Blumberg</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Multiparameter persistence image for topological machine learning</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 33</volume>, eds <person-group person-group-type="editor"><name><surname>Larochelle</surname> <given-names>H.</given-names></name> <name><surname>Ranzato</surname> <given-names>M.</given-names></name> <name><surname>Hadsell</surname> <given-names>R.</given-names></name> <name><surname>Balcan</surname> <given-names>M. F.</given-names></name> <name><surname>Lin</surname> <given-names>H.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>22432</fpage>&#x02013;<lpage>22444</lpage>.</citation></ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carri&#x000E8;re</surname> <given-names>M.</given-names></name> <name><surname>Chazal</surname> <given-names>F.</given-names></name> <name><surname>Ike</surname> <given-names>Y.</given-names></name> <name><surname>Lacombe</surname> <given-names>T.</given-names></name> <name><surname>Royer</surname> <given-names>M.</given-names></name> <name><surname>Umeda</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>PersLay: a neural network layer for persistence diagrams and new graph topological signatures</article-title>, in <source>Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics</source>, eds <person-group person-group-type="editor"><name><surname>Chiappa</surname> <given-names>S.</given-names></name> <name><surname>Calandra</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>2786</fpage>&#x02013;<lpage>2796</lpage>.</citation></ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carri&#x000E8;re</surname> <given-names>M.</given-names></name> <name><surname>Cuturi</surname> <given-names>M.</given-names></name> <name><surname>Oudot</surname> <given-names>S.</given-names></name></person-group> (<year>2017</year>). <article-title>Sliced Wasserstein kernel for persistence diagrams</article-title>, in <source>Proceedings of the 34th International Conference on Machine Learning</source> eds <person-group person-group-type="editor"><name><surname>Precup</surname> <given-names>D.</given-names></name> <name><surname>Teh</surname> <given-names>Y. W.</given-names></name></person-group> (<publisher-loc>Sydney, NSW</publisher-loc>: <publisher-name>International Convention Centre; PMLR</publisher-name>), <fpage>664</fpage>&#x02013;<lpage>673</lpage>.</citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Carri&#x000E8;re</surname> <given-names>M.</given-names></name> <name><surname>Oudot</surname> <given-names>S.</given-names></name> <name><surname>Ovsjanikov</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>Stable topological signatures for points on 3D shapes</article-title>. <source>Comput. Graph. Forum</source> <volume>34</volume>, <fpage>1</fpage>&#x02013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1111/cgf.12692</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cavanna</surname> <given-names>N. J.</given-names></name> <name><surname>Jahanseir</surname> <given-names>M.</given-names></name> <name><surname>Sheehy</surname> <given-names>D. R.</given-names></name></person-group> (<year>2015</year>). <article-title>A geometric perspective on sparse filtrations</article-title>, in <source>Proceedings of the Canadian Conference on Computational Geometry</source> (<publisher-loc>Kingston, ON</publisher-loc>).</citation></ref>
<ref id="B13">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chazal</surname> <given-names>F.</given-names></name> <name><surname>Fasy</surname> <given-names>B. T.</given-names></name> <name><surname>Lecci</surname> <given-names>F.</given-names></name> <name><surname>Rinaldo</surname> <given-names>A.</given-names></name> <name><surname>Wasserman</surname> <given-names>L.</given-names></name></person-group> (<year>2014</year>). <article-title>Stochastic convergence of persistence landscapes and silhouettes</article-title>, in <source>Proceedings of the Thirtieth Annual Symposium on Computational Geometry, SOCG&#x00027;14</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>474</fpage>&#x02013;<lpage>483</lpage>. <pub-id pub-id-type="doi">10.1145/2582112.2582128</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>C.</given-names></name> <name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name></person-group> (<year>2011</year>). <article-title>Diffusion runs low on persistence fast</article-title>, in <source>Proceedings of the IEEE International Conference on Computer Vision (ICCV)</source> (<publisher-loc>Red Hook, NY</publisher-loc>: <publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>423</fpage>&#x02013;<lpage>430</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV.2011.6126271</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>C.</given-names></name> <name><surname>Ni</surname> <given-names>X.</given-names></name> <name><surname>Bai</surname> <given-names>Q.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name></person-group> (<year>2019</year>). <article-title>A topological regularizer for classifiers via persistent homology</article-title>, in <source>Proceedings of Machine Learning Research</source>, eds <person-group person-group-type="editor"><name><surname>Chaudhuri</surname> <given-names>K.</given-names></name> <name><surname>Sugiyama</surname> <given-names>M.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>2573</fpage>&#x02013;<lpage>2582</lpage>.</citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chevyrev</surname> <given-names>I.</given-names></name> <name><surname>Nanda</surname> <given-names>V.</given-names></name> <name><surname>Oberhauser</surname> <given-names>H.</given-names></name></person-group> (<year>2018</year>). <article-title>Persistence paths and signature features in topological data analysis</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>42</volume>, <fpage>192</fpage>&#x02013;<lpage>202</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2018.2885516</pub-id><pub-id pub-id-type="pmid">30530312</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen-Steiner</surname> <given-names>D.</given-names></name> <name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name> <name><surname>Harer</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>Stability of persistence diagrams</article-title>. <source>Discrete Comput. Geom</source>. <volume>37</volume>, <fpage>103</fpage>&#x02013;<lpage>120</lpage>. <pub-id pub-id-type="doi">10.1007/s00454-006-1276-5</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen-Steiner</surname> <given-names>D.</given-names></name> <name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name> <name><surname>Harer</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>Extending persistence using Poincar&#x000E9; and Lefschetz duality</article-title>. <source>Found. Comput. Math</source>. <volume>9</volume>, <fpage>79</fpage>&#x02013;<lpage>103</lpage>. <pub-id pub-id-type="doi">10.1007/s10208-008-9027-z</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen-Steiner</surname> <given-names>D.</given-names></name> <name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name> <name><surname>Harer</surname> <given-names>J.</given-names></name> <name><surname>Mileyko</surname> <given-names>Y.</given-names></name></person-group> (<year>2010</year>). <article-title>Lipschitz functions have L<sub><italic>p</italic></sub>-stable persistence</article-title>. <source>Found. Comput. Math</source>. <volume>10</volume>, <fpage>127</fpage>&#x02013;<lpage>139</lpage>. <pub-id pub-id-type="doi">10.1007/s10208-010-9060-6</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cohen-Steiner</surname> <given-names>D.</given-names></name> <name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name> <name><surname>Morozov</surname> <given-names>D.</given-names></name></person-group> (<year>2006</year>). <article-title>Vines and vineyards by updating persistence in linear time</article-title>, in <source>Proceedings of the Twenty-Second Annual Symposium on Computational Geometry, SCG &#x00027;06</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>119</fpage>&#x02013;<lpage>126</lpage>. <pub-id pub-id-type="doi">10.1145/1137856.1137877</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Di Fabio</surname> <given-names>B.</given-names></name> <name><surname>Ferri</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>Comparing persistence diagrams through complex vectors</article-title>, in <source>Image Analysis and Processing</source> &#x02013; <italic>ICIAP 2015</italic>, eds <person-group person-group-type="editor"><name><surname>Murino</surname> <given-names>V.</given-names></name> <name><surname>Puppo</surname> <given-names>E.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>294</fpage>&#x02013;<lpage>305</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-23231-7_27</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name> <name><surname>Harer</surname> <given-names>J.</given-names></name></person-group> (<year>2010</year>). <source>Computational Topology: An Introduction. Providence, RI: American Mathematical Society</source>. <pub-id pub-id-type="doi">10.1090/mbk/069</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Edelsbrunner</surname> <given-names>H.</given-names></name> <name><surname>Letscher</surname> <given-names>D.</given-names></name> <name><surname>Zomorodian</surname> <given-names>A.</given-names></name></person-group> (<year>2000</year>). <article-title>Topological persistence and simplification</article-title>, in <source>Proceedings 41st Annual Symposium on Foundations of Computer Science</source> (<publisher-loc>Redondo Beach, CA</publisher-loc>), <fpage>454</fpage>&#x02013;<lpage>463</lpage>. <pub-id pub-id-type="doi">10.1109/SFCS.2000.892133</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fefferman</surname> <given-names>C.</given-names></name> <name><surname>Mitter</surname> <given-names>S.</given-names></name> <name><surname>Narayanan</surname> <given-names>H.</given-names></name></person-group> (<year>2013</year>). <article-title>Testing the manifold hypothesis</article-title>. <source>J. Am. Math. Soc</source>. <volume>29</volume>, <fpage>983</fpage>&#x02013;<lpage>1049</lpage>. <pub-id pub-id-type="doi">10.1090/jams/852</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gabrielsson</surname> <given-names>R. B.</given-names></name> <name><surname>Carlsson</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>Exposition and interpretation of the topology of neural networks</article-title>, in <source>2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)</source>, <fpage>1069</fpage>&#x02013;<lpage>1076</lpage>. <pub-id pub-id-type="doi">10.1109/ICMLA.2019.00180</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Goodfellow</surname> <given-names>I.</given-names></name> <name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Mirza</surname> <given-names>M.</given-names></name> <name><surname>Xu</surname> <given-names>B.</given-names></name> <name><surname>Warde-Farley</surname> <given-names>D.</given-names></name> <name><surname>Ozair</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Generative adversarial nets</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 27</volume>, eds <person-group person-group-type="editor"><name><surname>Ghahramani</surname> <given-names>Z.</given-names></name> <name><surname>Welling</surname> <given-names>M.</given-names></name> <name><surname>Cortes</surname> <given-names>C.</given-names></name> <name><surname>Lawrence</surname> <given-names>N.</given-names></name> <name><surname>Weinberger</surname> <given-names>K. Q.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>).</citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Greengard</surname> <given-names>L.</given-names></name> <name><surname>Strain</surname> <given-names>J.</given-names></name></person-group> (<year>1991</year>). <article-title>The fast Gauss transform</article-title>. <source>SIAM J. Sci. Stat. Comput</source>. <volume>12</volume>, <fpage>79</fpage>&#x02013;<lpage>94</lpage>. <pub-id pub-id-type="doi">10.1137/0912004</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hatcher</surname> <given-names>A.</given-names></name></person-group> (<year>2000</year>). <source>Algebraic Topology</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="B29">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hofer</surname> <given-names>C.</given-names></name> <name><surname>Graf</surname> <given-names>F.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name> <name><surname>Kwitt</surname> <given-names>R.</given-names></name></person-group> (<year>2020a</year>). <article-title>Topologically densified distributions</article-title>, in <source>Proceedings of the 37th International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Daum&#x000E9;</surname> <given-names>H.</given-names> <suffix>III</suffix></name> <name><surname>Singh</surname> <given-names>A.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>4304</fpage>&#x02013;<lpage>4313</lpage>.</citation></ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hofer</surname> <given-names>C.</given-names></name> <name><surname>Graf</surname> <given-names>F.</given-names></name> <name><surname>Rieck</surname> <given-names>B.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name> <name><surname>Kwitt</surname> <given-names>R.</given-names></name></person-group> (<year>2020b</year>). <article-title>Graph filtration learning</article-title>, in <source>Proceedings of the 37th International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Daum&#x000E9;</surname> <given-names>H.</given-names> <suffix>III</suffix></name> <name><surname>Singh</surname> <given-names>A.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>4314</fpage>&#x02013;<lpage>4323</lpage>.</citation></ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hofer</surname> <given-names>C.</given-names></name> <name><surname>Kwitt</surname> <given-names>R.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name> <name><surname>Dixit</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>Connectivity-optimized representation learning via persistent homology</article-title>, in <source>Proceedings of the 36th International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Chaudhuri</surname> <given-names>K.</given-names></name> <name><surname>Salakhutdinov</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>2751</fpage>&#x02013;<lpage>2760</lpage>.</citation></ref>
<ref id="B32">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hofer</surname> <given-names>C.</given-names></name> <name><surname>Kwitt</surname> <given-names>R.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name> <name><surname>Uhl</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>Deep learning with topological signatures</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 30</volume>, eds <person-group person-group-type="editor"><name><surname>Guyon</surname> <given-names>I.</given-names></name> <name><surname>Luxburg</surname> <given-names>U. V.</given-names></name> <name><surname>Bengio</surname> <given-names>S.</given-names></name> <name><surname>Wallach</surname> <given-names>H.</given-names></name> <name><surname>Fergus</surname> <given-names>R.</given-names></name> <name><surname>Vishwanathan</surname> <given-names>S.</given-names></name> <name><surname>Garnett</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>).</citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hofer</surname> <given-names>C. D.</given-names></name> <name><surname>Kwitt</surname> <given-names>R.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>Learning representations of persistence barcodes</article-title>. <source>J. Mach. Learn. Res</source>. <volume>20</volume>, <fpage>1</fpage>&#x02013;<lpage>45</lpage>.</citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kerber</surname> <given-names>M.</given-names></name> <name><surname>Morozov</surname> <given-names>D.</given-names></name> <name><surname>Nigmetov</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>Geometry helps to compare persistence diagrams</article-title>. <source>ACM J. Exp. Algorith</source>. <volume>22</volume>. <pub-id pub-id-type="doi">10.1145/3064175</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Khrulkov</surname> <given-names>V.</given-names></name> <name><surname>Oseledets</surname> <given-names>I.</given-names></name></person-group> (<year>2018</year>). <article-title>Geometry score: a method for comparing generative adversarial networks</article-title>, in <source>Proceedings of the 35th International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Dy</surname> <given-names>J.</given-names></name> <name><surname>Krause</surname> <given-names>A.</given-names></name></person-group> (<publisher-loc>Stockholm</publisher-loc>: <publisher-name>PMLR</publisher-name>), <fpage>2621</fpage>&#x02013;<lpage>2629</lpage>.</citation></ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>K.</given-names></name> <name><surname>Kim</surname> <given-names>J.</given-names></name> <name><surname>Zaheer</surname> <given-names>M.</given-names></name> <name><surname>Kim</surname> <given-names>J.</given-names></name> <name><surname>Chazal</surname> <given-names>F.</given-names></name> <name><surname>Wasserman</surname> <given-names>L.</given-names></name></person-group> (<year>2020</year>). <article-title>PLLay: efficient topological layer based on persistent landscapes</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 33</volume>, eds <person-group person-group-type="editor"><name><surname>Larochelle</surname> <given-names>H.</given-names></name> <name><surname>Ranzato</surname> <given-names>M.</given-names></name> <name><surname>Hadsell</surname> <given-names>R.</given-names></name> <name><surname>Balcan</surname> <given-names>M. F.</given-names></name> <name><surname>Lin</surname> <given-names>H.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>15965</fpage>&#x02013;<lpage>15977</lpage>.</citation></ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kolouri</surname> <given-names>S.</given-names></name> <name><surname>Zou</surname> <given-names>Y.</given-names></name> <name><surname>Rohde</surname> <given-names>G. K.</given-names></name></person-group> (<year>2016</year>). <article-title>Sliced Wasserstein kernels for probability distributions</article-title>, in <source>IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>, <fpage>5258</fpage>&#x02013;<lpage>5267</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2016.568</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriege</surname> <given-names>N. M.</given-names></name> <name><surname>Johansson</surname> <given-names>F. D.</given-names></name> <name><surname>Morris</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <article-title>A survey on graph kernels</article-title>. <source>Appl. Netw. Sci</source>. <volume>5</volume>:<fpage>6</fpage>. <pub-id pub-id-type="doi">10.1007/s41109-019-0195-3</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kusano</surname> <given-names>G.</given-names></name> <name><surname>Fukumizu</surname> <given-names>K.</given-names></name> <name><surname>Hiraoka</surname> <given-names>Y.</given-names></name></person-group> (<year>2018</year>). <article-title>Kernel method for persistence diagrams via kernel embedding and weight factor</article-title>. <source>J. Mach. Learn. Res</source>. <volume>18</volume>, <fpage>1</fpage>&#x02013;<lpage>41</lpage>.</citation></ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kwitt</surname> <given-names>R.</given-names></name> <name><surname>Huber</surname> <given-names>S.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name> <name><surname>Lin</surname> <given-names>W.</given-names></name> <name><surname>Bauer</surname> <given-names>U.</given-names></name></person-group> (<year>2015</year>). <article-title>Statistical topological data analysis&#x02014;a kernel perspective</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 28</volume>, eds <person-group person-group-type="editor"><name><surname>Cortes</surname> <given-names>C.</given-names></name> <name><surname>Lawrence</surname> <given-names>N.</given-names></name> <name><surname>Lee</surname> <given-names>D.</given-names></name> <name><surname>Sugiyama</surname> <given-names>M.</given-names></name> <name><surname>Garnett</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>).</citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lacasa</surname> <given-names>L.</given-names></name> <name><surname>Luque</surname> <given-names>B.</given-names></name> <name><surname>Ballesteros</surname> <given-names>F.</given-names></name> <name><surname>Luque</surname> <given-names>J.</given-names></name> <name><surname>Nuno</surname> <given-names>J. C.</given-names></name></person-group> (<year>2008</year>). <article-title>From time series to complex networks: the visibility graph</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>105</volume>, <fpage>4972</fpage>&#x02013;<lpage>4975</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0709247105</pub-id><pub-id pub-id-type="pmid">18362361</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Maria</surname> <given-names>C.</given-names></name> <name><surname>Boissonnat</surname> <given-names>J.-D.</given-names></name> <name><surname>Glisse</surname> <given-names>M.</given-names></name> <name><surname>Yvinec</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>The GUDHI library: simplicial complexes and persistent homology</article-title>, in <source>Mathematical Software-ICMS 2014</source>, eds <person-group person-group-type="editor"><name><surname>Hong</surname> <given-names>H.</given-names></name> <name><surname>Yap</surname> <given-names>C.</given-names></name></person-group> (<publisher-loc>Berlin; Heidelberg</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>167</fpage>&#x02013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-662-44199-2_28</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Moor</surname> <given-names>M.</given-names></name> <name><surname>Horn</surname> <given-names>M.</given-names></name> <name><surname>Rieck</surname> <given-names>B.</given-names></name> <name><surname>Borgwardt</surname> <given-names>K.</given-names></name></person-group> (<year>2020</year>). <article-title>Topological autoencoders</article-title>, in <source>Proceedings of the 37th International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Daum&#x000E9;</surname> <given-names>H.</given-names> <suffix>III</suffix></name> <name><surname>Singh</surname> <given-names>A.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>7045</fpage>&#x02013;<lpage>7054</lpage>.</citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perea</surname> <given-names>J.</given-names></name> <name><surname>Deckard</surname> <given-names>A.</given-names></name> <name><surname>Haase</surname> <given-names>S.</given-names></name> <name><surname>Harer</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>SW1PerS: sliding windows and 1-persistence scoring; discovering periodicity in gene expression time series data</article-title>. <source>BMC Bioinformatics</source> <volume>16</volume>:<fpage>257</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-015-0645-6</pub-id><pub-id pub-id-type="pmid">26277424</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rahimi</surname> <given-names>A.</given-names></name> <name><surname>Recht</surname> <given-names>B.</given-names></name></person-group> (<year>2008</year>). <article-title>Random features for large-scale kernel machines</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 20</volume>, eds <person-group person-group-type="editor"><name><surname>Platt</surname> <given-names>J.</given-names></name> <name><surname>Koller</surname> <given-names>D.</given-names></name> <name><surname>Singer</surname> <given-names>Y.</given-names></name> <name><surname>Roweis</surname> <given-names>S.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>).</citation></ref>
<ref id="B46">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ramamurthy</surname> <given-names>K. N.</given-names></name> <name><surname>Varshney</surname> <given-names>K.</given-names></name> <name><surname>Mody</surname> <given-names>K.</given-names></name></person-group> (<year>2019</year>). <article-title>Topological data analysis of decision boundaries with application to model selection</article-title>, in <source>Proceedings of the 36th International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Chaudhuri</surname> <given-names>K.</given-names></name> <name><surname>Salakhutdinov</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>5351</fpage>&#x02013;<lpage>5360</lpage>.</citation></ref>
<ref id="B47">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Reininghaus</surname> <given-names>J.</given-names></name> <name><surname>Huber</surname> <given-names>S.</given-names></name> <name><surname>Bauer</surname> <given-names>U.</given-names></name> <name><surname>Kwitt</surname> <given-names>R.</given-names></name></person-group> (<year>2015</year>). <article-title>A stable multi-scale kernel for topological machine learning</article-title>, in <source>2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>, <fpage>4741</fpage>&#x02013;<lpage>4748</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2015.7299106</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rieck</surname> <given-names>B.</given-names></name> <name><surname>Bock</surname> <given-names>C.</given-names></name> <name><surname>Borgwardt</surname> <given-names>K.</given-names></name></person-group> (<year>2019a</year>). <article-title>A persistent Weisfeiler-Lehman procedure for graph classification</article-title>, in <source>International Conference on Machine Learning</source>, eds <person-group person-group-type="editor"><name><surname>Chaudhuri</surname> <given-names>K.</given-names></name> <name><surname>Salakhutdinov</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>5448</fpage>&#x02013;<lpage>5458</lpage>.</citation></ref>
<ref id="B49">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rieck</surname> <given-names>B.</given-names></name> <name><surname>Sadlo</surname> <given-names>F.</given-names></name> <name><surname>Leitte</surname> <given-names>H.</given-names></name></person-group> (<year>2020a</year>). <article-title>Topological machine learning with persistence indicator functions</article-title>, in <source>Topological Methods in Data Analysis and Visualization V</source>, eds <person-group person-group-type="editor"><name><surname>Carr</surname> <given-names>H.</given-names></name> <name><surname>Fujishiro</surname> <given-names>I.</given-names></name> <name><surname>Sadlo</surname> <given-names>F.</given-names></name> <name><surname>Takahashi</surname> <given-names>S.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>87</fpage>&#x02013;<lpage>101</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-43036-8_6</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rieck</surname> <given-names>B.</given-names></name> <name><surname>Togninalli</surname> <given-names>M.</given-names></name> <name><surname>Bock</surname> <given-names>C.</given-names></name> <name><surname>Moor</surname> <given-names>M.</given-names></name> <name><surname>Horn</surname> <given-names>M.</given-names></name> <name><surname>Gumbsch</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2019b</year>). <article-title><italic>Neural</italic> persistence: a complexity measure for deep neural networks using algebraic topology</article-title>, in <source>International Conference on Learning Representations</source>.</citation></ref>
<ref id="B51">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rieck</surname> <given-names>B.</given-names></name> <name><surname>Yates</surname> <given-names>T.</given-names></name> <name><surname>Bock</surname> <given-names>C.</given-names></name> <name><surname>Borgwardt</surname> <given-names>K.</given-names></name> <name><surname>Wolf</surname> <given-names>G.</given-names></name> <name><surname>Turk-Browne</surname> <given-names>N.</given-names></name> <etal/></person-group>. (<year>2020b</year>). <article-title>Uncovering the topology of time-varying fMRI data using cubical persistence</article-title>, in <source>Advances in Neural Information Processing Systems (NeurIPS)</source>, <volume>Vol. 33</volume>, eds <person-group person-group-type="editor"><name><surname>Larochelle</surname> <given-names>H.</given-names></name> <name><surname>Ranzato</surname> <given-names>M.</given-names></name> <name><surname>Hadsell</surname> <given-names>R.</given-names></name> <name><surname>Balcan</surname> <given-names>M. F.</given-names></name> <name><surname>Lin</surname> <given-names>H.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>6900</fpage>&#x02013;<lpage>6912</lpage>.</citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sheehy</surname> <given-names>D. R.</given-names></name></person-group> (<year>2013</year>). <article-title>Linear-size approximations to the Vietoris-Rips filtration</article-title>. <source>Discrete Comput. Geom</source>. <volume>49</volume>, <fpage>778</fpage>&#x02013;<lpage>796</lpage>. <pub-id pub-id-type="doi">10.1007/s00454-013-9513-1</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shervashidze</surname> <given-names>N.</given-names></name> <name><surname>Borgwardt</surname> <given-names>K.</given-names></name></person-group> (<year>2009</year>). <article-title>Fast subtree kernels on graphs</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 22</volume>, eds <person-group person-group-type="editor"><name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Schuurmans</surname> <given-names>D.</given-names></name> <name><surname>Lafferty</surname> <given-names>J.</given-names></name> <name><surname>Williams</surname> <given-names>C.</given-names></name> <name><surname>Culotta</surname> <given-names>A.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>1660</fpage>&#x02013;<lpage>1668</lpage>.</citation></ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shervashidze</surname> <given-names>N.</given-names></name> <name><surname>Schweitzer</surname> <given-names>P.</given-names></name> <name><surname>van Leeuwen</surname> <given-names>E. J.</given-names></name> <name><surname>Mehlhorn</surname> <given-names>K.</given-names></name> <name><surname>Borgwardt</surname> <given-names>K. M.</given-names></name></person-group> (<year>2011</year>). <article-title>Weisfeiler-Lehman graph kernels</article-title>. <source>J. Mach. Learn. Res</source>. <volume>12</volume>, <fpage>2539</fpage>&#x02013;<lpage>2561</lpage>.</citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Skraba</surname> <given-names>P.</given-names></name> <name><surname>Turner</surname> <given-names>K.</given-names></name></person-group> (<year>2020</year>). <article-title>Wasserstein stability for persistence diagrams</article-title>. <source>arXiv preprint arXiv:2006.16824</source>.</citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stolz</surname> <given-names>B. J.</given-names></name> <name><surname>Harrington</surname> <given-names>H. A.</given-names></name> <name><surname>Porter</surname> <given-names>M. A.</given-names></name></person-group> (<year>2017</year>). <article-title>Persistent homology of time-dependent functional networks constructed from coupled time series</article-title>. <source>Chaos</source> <volume>27</volume>:<fpage>047410</fpage>. <pub-id pub-id-type="doi">10.1063/1.4978997</pub-id><pub-id pub-id-type="pmid">28456167</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Takens</surname> <given-names>F.</given-names></name></person-group> (<year>1981</year>). <article-title>Detecting strange attractors in turbulence</article-title>, in <source>Dynamical Systems and Turbulence, Warwick 1980 (Coventry, 1979/1980)</source>, eds <person-group person-group-type="editor"><name><surname>Rand</surname> <given-names>D.</given-names></name> <name><surname>Young</surname> <given-names>L. S.</given-names></name></person-group> (<publisher-loc>Berlin; New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>366</fpage>&#x02013;<lpage>381</lpage>. <pub-id pub-id-type="doi">10.1007/BFb0091924</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tauzin</surname> <given-names>G.</given-names></name> <name><surname>Lupo</surname> <given-names>U.</given-names></name> <name><surname>Tunstall</surname> <given-names>L.</given-names></name> <name><surname>Perez</surname> <given-names>J. B.</given-names></name> <name><surname>Caorsi</surname> <given-names>M.</given-names></name> <name><surname>Medina-Mardones</surname> <given-names>A. M.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>giotto-tda: a topological data analysis toolkit for machine learning and data exploration</article-title>. <source>J. Mach. Learn. Res</source>. <volume>22</volume>, <fpage>1</fpage>&#x02013;<lpage>6</lpage>.</citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Umeda</surname> <given-names>Y.</given-names></name></person-group> (<year>2017</year>). <article-title>Time series classification via topological data analysis</article-title>. <source>Trans. Jpn. Soc. Artif. Intell</source>. <volume>32</volume>, <fpage>1</fpage>&#x02013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1527/tjsai.D-G72</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Villani</surname> <given-names>C.</given-names></name></person-group> (<year>2009</year>). <source>Optimal Transport, Volume 338 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer-Verlag</publisher-name>. <pub-id pub-id-type="doi">10.1007/978-3-540-71050-9</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zaheer</surname> <given-names>M.</given-names></name> <name><surname>Kottur</surname> <given-names>S.</given-names></name> <name><surname>Ravanbakhsh</surname> <given-names>S.</given-names></name> <name><surname>Poczos</surname> <given-names>B.</given-names></name> <name><surname>Salakhutdinov</surname> <given-names>R. R.</given-names></name> <name><surname>Smola</surname> <given-names>A. J.</given-names></name></person-group> (<year>2017</year>). <article-title>Deep sets</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 30</volume>, eds <person-group person-group-type="editor"><name><surname>Guyon</surname> <given-names>I.</given-names></name> <name><surname>Luxburg</surname> <given-names>U. V.</given-names></name> <name><surname>Bengio</surname> <given-names>S.</given-names></name> <name><surname>Wallach</surname> <given-names>H.</given-names></name> <name><surname>Fergus</surname> <given-names>R.</given-names></name> <name><surname>Vishwanathan</surname> <given-names>S.</given-names></name> <name><surname>Garnett</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>).</citation></ref>
<ref id="B62">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>Q.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name></person-group> (<year>2019</year>). <article-title>Learning metrics for persistence-based summaries and applications for graph classification</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 32</volume>, eds <person-group person-group-type="editor"><name><surname>Wallach</surname> <given-names>H.</given-names></name> <name><surname>Larochelle</surname> <given-names>H.</given-names></name> <name><surname>Beygelzimer</surname> <given-names>A.</given-names></name> <name><surname>d&#x00027;Alch&#x000E9;-Buc</surname> <given-names>F.</given-names></name> <name><surname>Fox</surname> <given-names>E.</given-names></name> <name><surname>Garnett</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>Curran Associates, Inc.</publisher-name>).</citation></ref>
<ref id="B63">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>Q.</given-names></name> <name><surname>Ye</surname> <given-names>Z.</given-names></name> <name><surname>Chen</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>Persistence enhanced graph neural network</article-title>, in <source>Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics</source>, eds <person-group person-group-type="editor"><name><surname>Chiappa</surname> <given-names>S.</given-names></name> <name><surname>Calandra</surname> <given-names>R.</given-names></name></person-group> (<publisher-name>PMLR</publisher-name>), <fpage>2896</fpage>&#x02013;<lpage>2906</lpage>.</citation></ref>
<ref id="B64">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>S.</given-names></name> <name><surname>Zelikman</surname> <given-names>E.</given-names></name> <name><surname>Lu</surname> <given-names>F.</given-names></name> <name><surname>Ng</surname> <given-names>A. Y.</given-names></name> <name><surname>Carlsson</surname> <given-names>G. E.</given-names></name> <name><surname>Ermon</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>Evaluating the disentanglement of deep generative models through manifold topology</article-title>, in <source>International Conference on Learning Representations</source>.</citation></ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zomorodian</surname> <given-names>A.</given-names></name> <name><surname>Carlsson</surname> <given-names>G.</given-names></name></person-group> (<year>2005</year>). <article-title>Computing persistent homology</article-title>. <source>Discrete Comput. Geom</source>. <volume>33</volume>, <fpage>249</fpage>&#x02013;<lpage>274</lpage>. <pub-id pub-id-type="doi">10.1007/s00454-004-1146-y</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was partially funded and supported by the Swiss National Science Foundation [Spark grant 190466, FH and BR]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</p></fn>
</fn-group>
</back>
</article>
