Topology Applied to Machine Learning: From Global to Local

Through the use of examples, we explain one way in which applied topology has evolved since the birth of persistent homology in the early 2000s. The first applications of topology to data emphasized the global shape of a dataset, such as the three-circle model for 3 × 3 pixel patches from natural images, or the configuration space of the cyclo-octane molecule, which is a sphere with a Klein bottle attached via two circles of singularity. In these studies of global shape, short persistent homology bars are disregarded as sampling noise. More recently, however, persistent homology has been used to address questions about the local geometry of data. For instance, how can local geometry be vectorized for use in machine learning problems? Persistent homology and its vectorization methods, including persistence landscapes and persistence images, provide popular techniques for incorporating both local geometry and global topology into machine learning. Our meta-hypothesis is that the short bars are as important as the long bars for many machine learning tasks. In defense of this claim, we survey applications of persistent homology to shape recognition, agent-based modeling, materials science, archaeology, and biology. Additionally, we survey work connecting persistent homology to geometric features of spaces, including curvature and fractal dimension, and various methods that have been used to incorporate persistent homology into machine learning.


Introduction
Applied topology is designed to measure the shape of data, but what is shape? Early examples in applied topology found low-dimensional structures in high-dimensional datasets, such as the three-circle and Klein bottle models for grayscale natural image patches. These models are global: they parameterize the entire dataset, in the sense that most of the data points look like some point in the model, plus noise. In more recent applications, however, the shape that is being measured is not global, but instead local. Local features include texture, small-scale geometry, and the structure of noise.
Indeed, for the first decade after the invention of persistent homology, the primary story was that significant features in a dataset corresponded to long bars in the persistence barcode, whereas shorter bars generally corresponded to sampling noise. This story has evolved as applied topology has become incorporated into the machine learning pipeline. In machine learning applications, many researchers have independently found that the short bars are often the most discriminating: the shape of the noise, or of the local geometry, is what often enables high classification accuracy. We want to emphasize that short bars do matter. Indeed, the short bars in persistent homology are currently one of the best out-of-the-box methods for summarizing local geometry for use in machine learning. Though humans are not trained to interpret short persistent homology bars (there may even be too many short bars for the human eye to count), machine learning algorithms can be trained to do so. In this way, persistent homology has greatly expanded in scope during the second decade after its invention: persistent homology has important applications as a descriptor not only of global shape, but also of local geometry.
In this perspective article, we begin by outlining some of the most famous early applications of persistent homology in the global analysis of data, in which short bars were disregarded as noise. Our meta-hypothesis, however, is that short bars do matter, and furthermore, they matter crucially when combining topology with machine learning. As a partial defense for this claim, we provide a selected survey on the use of persistent homology in measuring texture, noise, local geometry, fractal dimension, and local curvature. We predict that the applications of persistent homology to machine learning will continue to advance in number, impact, and scope, as persistent homology is a mathematically motivated out-of-the-box tool that one can use to summarize not only the global topology but also the local geometry of a wide variety of datasets.
The two most frequent contexts in which persistent homology is applied are point cloud persistent homology and sublevel set persistent homology. In point cloud persistent homology, the input is a finite set of points (a point cloud) residing in Euclidean space or some other metric space [Carlsson, 2009]. For any real number r > 0, we consider the union of all balls of radius r centered at the points of our point cloud; see Figure 1. This union of balls provides our filtration as the radius r increases (see footnote 1). A typical interpretation of the resulting persistent homology, from the global perspective, is that the long persistent homology bars recover the homology of the "true" underlying space from which the point cloud was sampled [Chazal and Oudot, 2008]. A more modern and increasingly utilized perspective is that the short persistent homology bars recover the local geometry, that is, the texture, curvature, or fractal dimension of the point cloud data.
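To make the union-of-balls filtration concrete, the following minimal sketch computes only the 0-dimensional bars of a small point cloud. It assumes Euclidean distance and uses the fact that two balls of radius r overlap once r reaches half the distance between their centers, so connected components merge in order of increasing pairwise distance. The function name `h0_barcode` and the toy point cloud are illustrative assumptions; in practice one would use a dedicated persistent homology library, which also handles higher-dimensional homology.

```python
import itertools
import math


def h0_barcode(points):
    """Finite 0-dimensional bars (birth, death) of the union-of-balls filtration.

    Every component is born at radius 0. Two components merge when the balls
    around their closest points first touch, i.e. at half that distance.
    The single bar that never dies is omitted.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, sorted from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    bars = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            bars.append((0.0, dist / 2.0))
    return bars


# Hypothetical toy data: a tight cluster of three points and a distant pair.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(cloud)
print(bars)
```

In this toy example the three short bars record the spacing between neighboring points (local geometry), while the one long finite bar records the gap between the two clusters (global shape).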
In sublevel set persistent homology, the input is instead a real-valued function f : Y → R defined on a space Y [Cohen-Steiner et al., 2007]. For example, Y may be a Euclidean space of some dimension. The filtration arises by considering the sublevel sets {y ∈ Y | f(y) ≤ r}. As the threshold r increases, the sublevel sets grow. One can think of f as encoding an energy, in which case sublevel set persistent homology encodes the shape of low-energy configurations. The length of a bar then measures how large an energy barrier must be exceeded in order for a topological feature to be filled in: a short bar corresponds to a feature that is quickly filled in by exceeding a low energy barrier, whereas a long bar corresponds to a topological feature that persists over a longer range of energies; see Figure 2. Sublevel set persistent homology is frequently applied to grayscale image data or matrix data, where a real-valued entry of the image or matrix is interpreted as the value of the function f on a pixel.
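The energy-barrier interpretation can be illustrated with a small sketch that computes the 0-dimensional sublevel set bars of a function sampled on a line. This is a simplified setting chosen for illustration (a 1D sequence in which neighboring samples are adjacent), and the name `sublevel_h0` and the sample values are assumptions rather than anything taken from the cited work. Each local minimum gives birth to a component, and when two components meet, the one with the higher minimum dies.

```python
def sublevel_h0(values):
    """Finite 0-dimensional bars (birth, death) of a 1D sublevel set filtration.

    Samples enter the sublevel set in order of increasing value. When a new
    sample joins two existing components, the component with the larger birth
    value dies (the elder rule). The bar of the global minimum never dies and
    is omitted.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [None] * n  # None marks samples not yet in the sublevel set
    birth = {}           # component root -> value at which it was born

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars = []
    for i in order:
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                root_i, root_j = find(i), find(j)
                if root_i == root_j:
                    continue
                if birth[root_i] > birth[root_j]:
                    young, old = root_i, root_j
                else:
                    young, old = root_j, root_i
                # Skip zero-length bars created by non-minimum samples.
                if values[i] > birth[young]:
                    bars.append((birth[young], values[i]))
                parent[young] = old
    return bars


# Hypothetical energy profile: a deep well (0.0), a well at 1.0 behind a
# barrier of height 3.0, and a shallow well at 1.1 behind a barrier of 1.2.
energy = [2.0, 0.0, 3.0, 1.0, 1.2, 1.1, 4.0]
bars = sublevel_h0(energy)
print(bars)
```

Here the shallow well yields the short bar (1.1, 1.2), since only a low barrier separates it from its neighbor, whereas the well at 1.0 yields the longer bar (1.0, 3.0).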
We remark that the "union of balls" filtration for point cloud persistent homology can be viewed as a version of sublevel set persistent homology: a union of balls of radius r is the sublevel set at threshold r of the distance function to the set of points in the point cloud.
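Spelled out in symbols, and assuming closed balls and a finite point cloud X in a metric space with distance written as a norm, the remark above reads as follows. The notation d_X for the distance function is our own choice for illustration.

```latex
d_X(y) = \min_{x \in X} \lVert y - x \rVert,
\qquad
\bigcup_{x \in X} \overline{B}(x, r) = \{\, y : d_X(y) \le r \,\}.
```

A point y lies in the union exactly when some x in X satisfies ‖y − x‖ ≤ r, which holds exactly when the minimum over X is at most r.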
Persistent homology can be represented in two equivalent ways: either as a persistence barcode or as a persistence diagram; see Figure 3. Each interval in the persistence barcode is represented in the persistence diagram by a point in the plane, with its birth coordinate on the horizontal axis and with its death coordinate on the vertical axis. As the death of each feature is after its birth, persistence diagram points all lie above the diagonal line y = x. Short bars in the barcode correspond to persistence diagram points close to the diagonal, and long bars in the barcode correspond to persistence diagram points far from the diagonal.
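The correspondence between bar length and distance from the diagonal is elementary: a diagram point (b, d) lies at perpendicular distance (d − b)/√2 from the line y = x. The sketch below illustrates this on a made-up barcode. The threshold used to separate "short" from "long" bars is an arbitrary assumption for the example; nothing in the theory singles out a particular cutoff.

```python
import math

# Hypothetical barcode: each bar is a (birth, death) pair with death > birth.
barcode = [(0.0, 0.5), (0.0, 4.5), (1.0, 1.25)]


def diagonal_distance(bar):
    """Perpendicular distance from the diagram point (birth, death) to y = x."""
    birth, death = bar
    return (death - birth) / math.sqrt(2.0)


def split_bars(bars, threshold):
    """Partition bars by length; the threshold is a user-chosen assumption."""
    short = [bar for bar in bars if bar[1] - bar[0] < threshold]
    long_ = [bar for bar in bars if bar[1] - bar[0] >= threshold]
    return short, long_


short_bars, long_bars = split_bars(barcode, threshold=1.0)
for bar in barcode:
    print(bar, "length:", bar[1] - bar[0], "distance:", diagonal_distance(bar))
```

Because the distance is a fixed multiple of the bar length, ranking bars by length and ranking diagram points by distance from the diagonal give the same order.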
Footnote 1: In practice, the union of balls is stored or approximated by a simplicial complex, for example a Čech or Vietoris-Rips complex [Chazal et al., 2013].

Examples measuring global shape
The earliest applications of topology to data measured the global shape of a dataset. In these examples, the long persistent homology bars represented the true homology underlying the data, whereas the small bars were ignored as artifacts of sampling noise. What do we mean by "global shape"? Consider, for example, conformations of the cyclo-octane molecule C8H16, which consists of a ring of eight carbon atoms, each bonded to a pair of hydrogen atoms; see Figure 4 (left). The locations of the carbon atoms in a conformation approximately determine the locations of the hydrogen atoms via energy minimization, and hence each molecule conformation can be mapped to a point in R^24 = R^(8·3), as the location of each carbon atom can be specified by three coordinates. This map realizes the conformation space of cyclo-octane as a subset of R^24, and then we mod out by rigid rotations and translations. Topologically, the conformation space of cyclo-octane turns out to be the union of a sphere with a Klein bottle, glued together along two circles of singularities; see Figure 4 (right). This model was obtained by Martin et al. [2010], Martin and Watson [2011], and Brown et al. [2008], who furthermore obtain a triangulation of this dataset (a representation of the dataset as a union of vertices, edges, and triangles).
[Figure 4: Cyclo-octane (C8H16) data: more than 1,000,000 points in 24-dimensional space.]

Adams and Moy
Topology applied to machine learning

A Klein bottle, like a sphere, is a 2-dimensional manifold. Whereas a sphere can be embedded in 3-dimensional space, a Klein bottle requires at least four dimensions in order to be embedded without self-intersections. When a sphere and Klein bottle are glued together along two circles, the union is no longer a manifold (near the circles, the space looks like the tail of a dart instead of like a sheet of paper). However, the result is still a 2-dimensional stratified space. In Figure 5, we compute the persistent homology of a point cloud dataset of 1,000,000 cyclo-octane molecule configurations. The short bars are interpreted as noise, whereas the long bars are interpreted as attributes of the underlying shape. We obtain a single connected component, a single 1-dimensional hole, and two significant 2-dimensional homology features. These homology signatures agree with the homology of the union of a sphere with a Klein bottle, glued together along two circles of singularities.
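As a quick consistency check (not a derivation of the individual Betti numbers), one can compare Euler characteristics. The computation below assumes that the two gluing circles are disjoint and that the union admits a cell structure in which the sphere S^2 and the Klein bottle K are subcomplexes meeting exactly in those two circles, so that inclusion-exclusion applies.

```latex
\chi(S^2 \cup K) = \chi(S^2) + \chi(K) - \chi(S^1 \sqcup S^1) = 2 + 0 - 0 = 2,
\qquad
\beta_0 - \beta_1 + \beta_2 = 1 - 1 + 2 = 2.
```

The alternating sum of the observed Betti numbers 1, 1, 2 matches the Euler characteristic of the glued space, as it must for any choice of field coefficients.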

One of the first applications of persistent homology was to measure the global shape of a dataset of image patches [Carlsson et al., 2008]. This dataset of natural 3 × 3 pixel patches from black-and-white photographs of indoor and outdoor scenes in fact has three different global shapes! The most common patches lie along a circle of possible directions of linear gradient patches (varying from black to gray to white). The next most common patches lie along a three-circle model, additionally including a circle's worth of horizontal quadratic gradients, and a circle's worth of vertical quadratic gradients. At the next level of resolution, the most common patches in some sense lie along a Klein bottle. All three of these models (the circle, the three circles, and the Klein bottle) are global models, summarizing the global shape of the dataset at different resolutions.

Examples measuring local geometry
When reading persistent homology barcodes, humans are not trained to read the small noisy bars. Nor are we necessarily capable of doing so on our own, if there are thousands of small bars! Though a single long bar in persistent homology may carry a lot of information, a single small bar typically does not. However, together a collection of small bars may unexpectedly carry a large amount of geometric content. A long bar is a trumpet solo, piercing through to be heard over the orchestra with ease. The small bars are the string section: each small bar on its own is relatively quiet, but in concert the small bars together deliver a powerful message. We survey several modern examples where small persistent homology bars are now the signal, instead of the noise.
Birds, fish, and insects move as flocks, schools, and hordes in a way which is determined by collective motion: each animal's next motion is a random function of the location of its nearby neighbors. In a flock of thousands of birds, there is an impressively large amount of time-varying geometry, including for example all n(n-1)/2 pairwise distances, where n is the number of birds; see Figure 6. How can one summarize this much geometric content for use in machine learning tasks, say to predict how the motion of the flock will vary next, or to predict some of the parameters in a mathematical model approximately governing the motion of the birds? Persistent homology has been used in Topaz et al. [2015], Ulmer et al. [2018], Xian et al. [2020] to reduce a large collection of geometric content down to a concise summary. These datasets of animal swarms do not lie along beautiful manifolds (global shapes), but nevertheless there is a wealth of information in the local geometry as measured by the short persistent homology bars.
Other recent work has used persistent homology to characterize the complexity of geometric objects. Bendich et al. [2016b] apply sublevel set persistent homology to the study of brain artery trees, examining the effects of age and sex on the barcodes generated from artery trees. While younger brains have artery trees containing more local twisting and branching, older brains are sparser with fewer small branches and leaves. The authors find that certain statistics extracted from persistent homology are sensitive to these differences. They also notice that it is not the longest bars, but the bars of medium length that are the most discriminatory.
[Figure: excerpt from Motta et al., Physica D 380-381 (2018) 17-30, showing simplicial complexes at connectivity parameter r = 0.14 built from nanodots extracted from simulated surfaces, together with the corresponding persistence pairs.]
In other datasets where points are nearly evenly spaced, barcodes will consist of bars with mostly similar birth and death times. Consider for instance the point cloud persistent homology for a square grid of points in the plane: all 0-dimensional bars are identical and adding a small amount of noise to the points will result in a small change to the bars. The same is true for 1-dimensional bars. With this in mind, Motta et al. [2018] use persistent homology to measure the order, or regularity, of lattice-like datasets, focusing on hexagonal grids formed by ion bombardment of solid surfaces; see Figure 7. The authors' techniques use the variance of 0-dimensional homology bar lengths, and the sum of the lengths of 1-dimensional homology bars, as well as a particular linear combination of the two especially suited to hexagonal lattices. Their results suggest that techniques based on persistent homology can provide useful measures of order that are sensitive to both large scale and small scale defects in lattices.
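The square grid example can be checked numerically. The following is a minimal sketch, not the pipeline of Motta et al.: it assumes the Vietoris-Rips convention in which a 0-dimensional bar dies at the distance where two components merge, so that the finite bar lengths are the edge lengths of a minimum spanning tree. The function name `h0_bar_lengths`, the 6 × 6 grid, and the noise level 0.1 are illustrative choices.

```python
import math
import random
from statistics import pvariance


def h0_bar_lengths(points):
    """Lengths of the finite 0-dimensional Vietoris-Rips bars of a point cloud.

    Every point is born at scale 0 and a component dies when it merges into
    another one, so the finite bar lengths are the edge lengths of a minimum
    spanning tree (computed here with Kruskal's algorithm and union-find).
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    lengths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one bar dies
            parent[ri] = rj
            lengths.append(dist)
    return lengths  # n - 1 finite bars; the single infinite bar is omitted


# A perfect square grid and a noisy copy of it (hypothetical test data).
rng = random.Random(0)
grid = [(float(x), float(y)) for x in range(6) for y in range(6)]
noisy = [(x + rng.gauss(0, 0.1), y + rng.gauss(0, 0.1)) for x, y in grid]

var_grid = pvariance(h0_bar_lengths(grid))
var_noisy = pvariance(h0_bar_lengths(noisy))
print(var_grid, var_noisy)
```

On the perfect grid every finite bar has length 1, so the variance vanishes, while the noisy copy yields a positive variance. This is the sense in which the variance of 0-dimensional bar lengths can act as a measure of disorder.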
Point cloud persistence has also been used to summarize the local order and randomness in other materials science contexts, including amorphous solids and glass [Hiraoka et al., 2016, Nakamura et al., 2015, Hirata et al., 2020] and nanoporous materials used in gas adsorption [Krishnapriyan et al., 2020].

Templated patterns
Though the above examples focus on point cloud persistence, sublevel set persistent homology has also been used to detect the local geometry of functions. Kramár et al. [2016] use sublevel set persistence to summarize the complicated spatio-temporal patterns that arise from dynamical systems modeling fluid flow, including turbulence (Kolmogorov flow) and heat convection (Rayleigh-Bénard convection). With sublevel set persistence, Zeppelzauer et al. [2016] improve 3D surface classification, including on an archaeology task of segmenting engraved regions of rock from the surrounding natural rock surface. In a task of tracking automobiles, Bendich et al. [2016a] use the sublevel set persistent homology of driver speeds in order to characterize driver behaviors and prune out improbable paths from their multiple hypothesis tracking framework.
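To make sublevel set persistence concrete in the simplest setting, here is a minimal sketch for a function sampled along a line, such as a time series of speeds. It is not the method of any of the papers cited above: the function name `sublevel_persistence_1d` and the sample values are hypothetical. Each local minimum gives birth to a component, and when two components meet at a local maximum the one with the higher birth value dies (the elder rule).

```python
def sublevel_persistence_1d(values):
    """0-dimensional sublevel set persistence pairs of a sequence.

    Returns a sorted list of (birth, death) pairs. The bar of the global
    minimum never dies and is reported with death = float("inf").
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [None] * n  # None means "not yet in the sublevel set"
    birth = {}           # root index -> birth value of its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the larger birth value dies.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if values[i] > birth[young]:  # skip zero-length bars
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    root = find(order[0])
    pairs.append((birth[root], float("inf")))
    return sorted(pairs)


# Hypothetical speed samples with three local minima.
speeds = [3.0, 1.0, 4.0, 1.5, 5.0, 0.5, 2.0]
pairs = sublevel_persistence_1d(speeds)
print(pairs)  # [(0.5, inf), (1.0, 5.0), (1.5, 4.0)]
```

The short bar (1.5, 4.0) records a shallow dip in the signal, while the longer bar (1.0, 5.0) records a deeper one. Summaries of many such bars are the kind of local information discussed in this section.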

Theory of how persistent homology measures local geometry
Recent work has begun to formalize the idea that persistent homology measures local geometry. Bubenik et al. [2020] explore the effect of the curvature of a space on the persistent homology of a sample of points, focusing on disks in spaces with constant curvature. Their work includes theoretical results about the persistence of triangles in these spaces, and they are also able to demonstrate experimentally that persistent homology in dimensions 0 and 1 can be used to accurately estimate the curvature given a random sample of points. Since the disks in spaces with different curvature are homeomorphic, the differences in persistent homology cannot be due to topology, but rather result from the geometric features of the spaces.
Fractal dimension is another measure of local geometry, and indeed some of the earliest applications of persistent homology in Vanessa Robins's PhD thesis were motivated as a way to capture the fractal dimension of an infinite set in Euclidean space [Robins, 2000, MacPherson and Schweinhart, 2012]. Can this also be applied to datasets, i.e. to random collections of finite sets of points? Given a random sample of points from a measure, persistent homology can be used to detect the fractal dimension of the support of the measure. This notion of persistent homology fractal dimension agrees with the Hausdorff/box-counting dimension for 0-dimensional persistent homology and a restricted class of measures; see Schweinhart [2019, 2020] for further theoretical developments.
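One way such an estimate could be computed is sketched below. It rests on an assumption that the text does not spell out: that the total length of the finite 0-dimensional bars of n sample points grows like n^((d-1)/d), where d is the dimension. Under that assumption, fitting the exponent beta on a log-log plot gives the estimate d = 1/(1 - beta). The function names, sample sizes, and test data are illustrative, and the total bar length is computed as the length of a minimum spanning tree.

```python
import math
import random


def mst_length(points):
    """Total edge length of a minimum spanning tree (Prim's algorithm, O(n^2)).

    This equals the sum of the finite 0-dimensional Vietoris-Rips bar lengths.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest connection of each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def ph0_dimension(samples):
    """Dimension estimate from point samples of increasing size.

    Fits log(total bar length) = beta * log(n) + c by least squares and
    returns 1 / (1 - beta), following the scaling assumption stated above.
    """
    xs = [math.log(len(s)) for s in samples]
    ys = [math.log(mst_length(s)) for s in samples]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    beta = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return 1.0 / (1.0 - beta)


# Random samples from the unit square, whose dimension is 2.
rng = random.Random(0)
square_samples = [
    [(rng.random(), rng.random()) for _ in range(n)] for n in (100, 200, 400)
]
print(ph0_dimension(square_samples))
```

With only three small samples the estimate is noisy. In practice one would average over many samples and use larger n, so this sketch illustrates the scaling idea rather than a calibrated estimator.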
A related line of work studies what can be proven about the topology of random point clouds, typically as the number of points in the point cloud goes to infinity [Adler et al., 2014, Bobrowski and Kahle, 2014, Bobrowski et al., 2015, Kahle, 2011]. The magnitude [Leinster, 2013] and magnitude homology [Willerton, 2017, Leinster and Shulman, 2017] of a metric space measure both local and global properties; ongoing work aims to connect magnitude with persistent homology [Otter, 2018, Govc and Hepworth, 2021]. See also Weinberger [2019] for connections between sublevel set persistent homology and the geometry of spaces of functions, including Lipschitz constants of functions. We predict that much more work demonstrating how local geometric features can be recovered from persistent homology barcodes will take place over the next decade.

Machine learning
Because persistent homology gives a concise description of the shape of data, it is not surprising that recent work has incorporated persistent homology into machine learning. Researchers have taken at least three distinct approaches: persistence barcodes have been adapted to be input to machine learning algorithms, topological methods have been used to create new algorithms, and persistent homology has been used to analyze machine learning algorithms.
Perhaps the most natural of these approaches is inputting persistence data into a machine learning algorithm. Though the persistent homology bars provide a summary of both local geometry and global topology, for a quantitative summary to be fully applicable it needs to be amenable for use in machine learning tasks. From barcodes, Bubenik [2015] creates persistence landscapes, which live in a Banach space of functions. Persistence landscapes are created by rotating a persistence diagram on its side, so that the diagonal line y = x becomes as flat as the horizon, and then using the persistence diagram points to trace out the peaks in a mountain landscape profile. From barcodes, Adams et al. [2017] create persistence images, a Euclidean vectorization enabling a diverse class of machine learning tools to be applied (see also Chen et al. [2015], Reininghaus et al. [2015]). A persistence image is created by taking a sum of Gaussians, one centered on each point in a persistence diagram, and then pixelating that surface to form an image. By analogy, recall that in point cloud persistent homology, one "blurs their vision" when looking at a dataset by replacing each data point with a ball; this is similar to the process of "blurring one's vision" when looking at a persistence diagram in order to create a persistence image.
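Both constructions can be sketched in a few lines. The code below is a simplified illustration, not the reference implementation of either paper. The landscape is evaluated pointwise as the k-th largest "tent" value. The image evaluates each Gaussian at pixel centers instead of integrating it over each pixel, and weights each Gaussian linearly by persistence, a choice assumed here so that points near the diagonal contribute little. The function names, grid extent, resolution, and `sigma` are all hypothetical parameters.

```python
import math


def landscape(diagram, k, t):
    """Value at t of the k-th persistence landscape function (k = 1, 2, ...).

    Each point (b, d) contributes a tent function max(0, min(t - b, d - t));
    the k-th landscape is the k-th largest tent value at t.
    """
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True
    )
    return tents[k - 1] if k <= len(tents) else 0.0


def persistence_image(diagram, resolution, extent, sigma):
    """Pixelated sum of weighted Gaussians in (birth, persistence) coordinates.

    Returns a resolution x resolution list of rows; rows index persistence
    and columns index birth, both sampled at pixel centers within `extent`.
    """
    lo, hi = extent
    step = (hi - lo) / resolution
    centers = [lo + (i + 0.5) * step for i in range(resolution)]
    image = [[0.0] * resolution for _ in range(resolution)]
    for b, d in diagram:
        pers = d - b  # rotate the diagram: (birth, death) -> (birth, persistence)
        for row, y in enumerate(centers):
            for col, x in enumerate(centers):
                g = math.exp(-((x - b) ** 2 + (y - pers) ** 2) / (2 * sigma**2))
                image[row][col] += pers * g  # linear weight by persistence
    return image


# A small hypothetical persistence diagram of (birth, death) pairs.
diagram = [(0.0, 4.0), (1.0, 3.0), (2.0, 2.5)]
print([landscape(diagram, k, 2.0) for k in (1, 2, 3)])  # [2.0, 1.0, 0.0]

image = persistence_image(diagram, resolution=8, extent=(0.0, 4.0), sigma=0.5)
vector = [pixel for row in image for pixel in row]  # 64-dimensional feature vector
```

Flattening the image gives a fixed-length vector regardless of how many bars the barcode contains, which is what allows standard machine learning tools to be applied.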
Persistence landscapes and images are only two of the many different methods that have recently been invented in order to transform persistence barcodes into machine learning input. Algorithms that require only a distance matrix, such as many clustering or dimensionality reduction algorithms, can be applied on the bottleneck or Wasserstein distances between persistence barcodes [Cohen-Steiner et al., 2007, Mileyko et al., 2011, Kerber et al., 2017]. Techniques for vectorizing persistence barcodes involve heat kernels [Carrière et al., 2015], entropy [Merelli et al., 2015, Atienza et al., 2018], rings of algebraic functions [Adcock et al., 2016], tropical coordinates [Kališnik, 2019], complex polynomials [Di Fabio and Ferri, 2015], and optimal transport [Carrière et al., 2017], among others. Some of these techniques, including those by Zhao and Wang [2019] and Divol and Polonik [2019], allow one to learn from the data the vectorization parameters that are best suited for a machine learning task on that dataset. Others allow one to plug persistent homology information directly into a neural network [Hofer et al., 2017]. Recent research on incorporating persistence as input for machine learning is vast and varied, and the above collection of references is far from complete.
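Of the summaries listed above, entropy is perhaps the simplest to write down. The sketch below computes the Shannon entropy of the normalized bar lengths, assuming a barcode with only finite bars (infinite bars would first need to be truncated or removed). The function name `persistent_entropy` and the example barcodes are illustrative.

```python
import math


def persistent_entropy(bars):
    """Shannon entropy of the normalized bar lengths of a finite barcode."""
    lengths = [d - b for b, d in bars if d > b]
    total = sum(lengths)
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Four equal bars: the entropy is maximal, log(4).
uniform = [(0.0, 1.0), (0.5, 1.5), (2.0, 3.0), (4.0, 5.0)]
# One long bar and two short ones: the entropy is much lower.
dominated = [(0.0, 100.0), (0.0, 1.0), (0.0, 1.0)]

print(persistent_entropy(uniform), persistent_entropy(dominated))
```

The value is largest when all bars have the same length and drops when a few long bars dominate, so this single number is sensitive to how the short bars are distributed relative to the long ones.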
As for the creation of new algorithms, persistent homology has recently been applied to regularization, a technique used in machine learning that penalizes overly complicated models to avoid overfitting. Chen et al. [2019] propose a "topological penalty function" for classification algorithms, which encourages a topologically simple decision boundary. Their method is based on measuring the relative importance of various connected components of the decision boundary via 0-dimensional persistent homology. They show how the gradient of such a penalty function can be computed, which is important for use in machine learning algorithms, and demonstrate their method on several examples. Similar work using topological methods to examine a decision boundary can also be found in Varshney and Ramamurthy [2015] and Ramamurthy et al. [2019].
Finally, other recent work has used persistent homology to analyze neural networks. Naitzat et al. [2020] provide experimental evidence that neural networks operate by simplifying the topology of a dataset. They examine the topology of a dataset and its images at the various layers of a neural network performing classification, finding that the corresponding barcodes become simpler as the data progresses through the network. Additionally, they observe the effects of different shapes of neural networks and different activation functions. They find that deeper neural networks have a tendency to simplify the topology of a dataset more gradually than shallow networks, and that networks with ReLU activation tend to simplify topology more in the earlier layers of a network than other activation functions.

Conclusion
Topological tools are often described as being able to stitch local data together in order to describe global features: from local to global. The history of applied topology, however, has in some sense gone in the reverse direction, from global to local, as surveyed above! Applied topology was developed in part to summarize global features in a point cloud dataset, as in the examples of the conformations of the cyclo-octane molecule or the collection of 3 × 3 pixel patches from images. If global shapes are the focus, long persistent homology bars are interpreted as the relevant features, while small bars are often disregarded as sampling artifacts or noise. However, in more recent applications, and in particular when using applied topology in concert with machine learning, it is often many short persistent homology bars that together form the signal. One of the biggest benefits of applied topology is that one need not choose a scale beforehand: persistent homology provides a useful summary of both the local and global features in a dataset, and this summary has been made accessible for use in machine learning tasks.
We have seen how the short bars can be a measure of local geometry, texture, curvature, and fractal dimension; their sensitivity to various features of datasets leads to the wide variety of applications surveyed here. Because persistent homology provides a concise, reductive view of the geometry of a dataset, for instance in the examples studying brain artery trees or hexagonal grids, it is not hard to imagine the potential applications to machine learning problems. This has led to recent techniques that turn barcodes into machine learning input, exemplified by persistence landscapes and persistence images. We hope that this wealth of recent work, which has shifted more attention to short persistent homology bars and the geometric information they summarize, will inspire further research at the intersection of applied topology, local geometry, and machine learning.

Funding
This material is based upon work supported by the National Science Foundation under Grant Number 1934725.