
Systematic Review article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

This article is part of the Research Topic: Verifying Autonomy: Formal Methods for Reliable Decision-Making

Formal Methods for Safety-Critical Machine Learning: A Systematic Literature Review

Provisionally accepted
• Embry-Riddle Aeronautical University, Daytona Beach, United States

The final, formatted version of the article will be published soon.

The integration of Machine Learning (ML) systems into safety-critical domains heightens the need for strong safety guarantees. Traditional testing-based verification techniques are insufficient for fully capturing the complex, data-driven, and non-deterministic behaviors of modern ML models. Formal methods, which provide rigorous mathematical guarantees that a system adheres to specified properties, have therefore attracted particular interest for ML systems in recent years. This work presents a comprehensive Systematic Literature Review of peer-reviewed research from 2020 to mid-2025 on the use of formal methods to enhance ML safety, with a focus on safety-critical applications. Following a structured protocol, 46 studies were identified across four major digital libraries and classified into eight categories: Reachability and Over-Approximation Techniques, SMT-based Verification and Abstraction/Refinement, MILP/ILP Approaches, Model Checking Approaches, Runtime Verification Approaches, Shielding Techniques, Control Barrier Function Methods, and Risk Verification Methods. The review synthesizes methodological advances, application areas, and comparative strengths over traditional verification, and presents bibliometric trends in the literature. The analysis reveals persistent challenges and gaps, including scalability to large and complex models, integration with training processes, and limited real-world validation. Future research opportunities include integrated training-verification loops, scalable verification frameworks, hybrid formal methods, and novel techniques for emerging ML paradigms such as Large Language Models. This work serves both as a state-of-the-art reference and as a roadmap for advancing the safe deployment of ML systems.
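To make the SMT-based category concrete, the following minimal sketch (not drawn from any reviewed study) encodes a tiny two-input, one-ReLU-unit network in the Z3 SMT solver and checks an output bound over a bounded input region; the network weights, input box, and safety threshold are all hypothetical, and the only assumed dependency is the `z3-solver` Python package. An `unsat` answer constitutes a formal proof that the property holds for every input in the region, not merely for tested samples.

```python
# Minimal sketch of SMT-based neural-network verification with Z3.
# The weights below are illustrative only; the ReLU is encoded exactly
# via an if-then-else term, so the solver's answer is sound and complete
# for this small network.
from z3 import Real, Solver, If, And, sat

x1, x2 = Real("x1"), Real("x2")

# Hidden pre-activation and its exact ReLU encoding.
z = 0.6 * x1 - 0.4 * x2 + 0.1
h = If(z > 0, z, 0)

# Output neuron.
y = 1.5 * h - 0.2

s = Solver()
# Input region of interest: the unit box [0, 1] x [0, 1].
s.add(And(0 <= x1, x1 <= 1, 0 <= x2, x2 <= 1))
# Negate the safety property y <= 0.9 to search for a counterexample.
s.add(y > 0.9)

if s.check() == sat:
    print("Counterexample found:", s.model())
else:
    print("Property y <= 0.9 holds for all inputs in the box (unsat).")
```

Exact encodings like this split on each ReLU's active/inactive phase, which grows exponentially with network size; that combinatorial blow-up is one instance of the scalability gap the review identifies across verification approaches.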

Keywords: formal methods, machine learning, safe autonomy, safety-critical systems, software verification

Received: 19 Nov 2025; Accepted: 28 Jan 2026.

Copyright: © 2026 Newcomb and Ochoa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Alexandra Newcomb

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.