TY  - JOUR
AU  - Hernández-Orozco, Santiago
AU  - Zenil, Hector
AU  - Riedel, Jürgen
AU  - Uccello, Adam
AU  - Kiani, Narsis A.
AU  - Tegnér, Jesper
PY  - 2021
M3  - Original Research
TI  - Algorithmic Probability-Guided Machine Learning on Non-Differentiable Spaces
JO  - Frontiers in Artificial Intelligence
UR  - https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2020.567356
VL  - Volume 3 - 2020
SN  - 2624-8212
N2  - We show how complexity theory can be introduced in machine learning to help bring together apparently disparate areas of current research. We show that this model-driven approach may require less training data and can potentially be more generalizable, as it shows greater resilience to random attacks. In an algorithmic space, the order of the elements is given by their algorithmic probability, which arises naturally from computable processes. We investigate the shape of a discrete algorithmic space when performing regression or classification using a loss function parametrized by algorithmic complexity, demonstrating that differentiability is not required to achieve results similar to those obtained using differentiable programming approaches such as deep learning. In doing so, we use examples that enable the two approaches to be compared (small ones, given the computational power required to estimate algorithmic complexity). We find and report that (i) machine learning can successfully be performed on a non-smooth surface using algorithmic complexity; (ii) solutions can be found using an algorithmic-probability classifier, establishing a bridge between a fundamentally discrete theory of computability and a fundamentally continuous mathematical theory of optimization methods; (iii) an algorithmically directed search technique in non-smooth manifolds can be formulated and conducted; and (iv) exploitation techniques and numerical methods for algorithmic search can be used to navigate these discrete non-differentiable spaces. We apply these to (a) the identification of generative rules from data observations; (b) solutions to image classification problems that are more resilient against pixel attacks than neural networks; (c) the identification of equation parameters from a small dataset in the presence of noise in a continuous ODE system problem; and (d) the classification of Boolean NK networks by (1) network topology, (2) underlying Boolean function, and (3) number of incoming edges.
ER  -