Machine Learning for Safety-Critical Applications: Opportunities, Challenges, and a Research Agenda (2025)

Suggested Citation: "1 Engineering Safety-Critical Systems in the Age of Machine Learning." National Academies of Sciences, Engineering, and Medicine. 2025. Machine Learning for Safety-Critical Applications: Opportunities, Challenges, and a Research Agenda. Washington, DC: The National Academies Press. doi: 10.17226/27970.

1 Engineering Safety-Critical Systems in the Age of Machine Learning

1.1 SAFETY-CRITICAL SYSTEMS

At the beginning of the 21st century, the National Academy of Engineering published a report identifying the 20 greatest engineering achievements of the 20th century.1,2 These engineering achievements have fundamentally transformed every aspect of our lives and enabled new visions for our future. The list includes airplanes, automobiles, and nuclear technologies, shown in Figure 1-1. In addition to being historic engineering accomplishments, these achievements are also prototypical examples of safety-critical systems: systems in which failures or malfunctions can result in harm to system users, the public, or the environment, or can have other catastrophic consequences. In such systems, safety is the top priority and cannot be compromised for other design considerations.

Over the past few decades, rigorous new engineering approaches have been developed for designing systems that can meet the most stringent and demanding socially acceptable safety criteria—examples include the nuclear reactor, automobile, and airplane (Figure 1-1). The results have been impressive. For example, in commercial aviation, the 2022 fatality risk for jet aircraft was 0.11 per million sectors, which translates to a person flying every day for approximately 25,000 years before experiencing a fatal accident.3

___________________

1 National Academy of Engineering, 2003, “Greatest Achievements of the 20th Century,” http://www.greatachievements.org.

2 G. Constable and B. Somerville, 2003, A Century of Innovation: Twenty Engineering Achievements That Transformed Our Lives, Joseph Henry Press, https://doi.org/10.17226/10726.

FIGURE 1-1 Examples of safety-critical systems include nuclear reactors, automobiles, and airplanes.
SOURCE: (left) Michael Kappel, 2011, “Nuclear Cooling Towers,” https://www.flickr.com/photos/m-i-k-e/6541544889/in/photostream. CC BY-ND 2.0; (middle) iStock.com/Tramino; (right) iStock.com/Jon Tetzlaff.

Aircraft are marvelous engineering achievements not only because humans can now fly but also because they are among the most dependable and reliable engineered systems ever built.
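As a back-of-the-envelope illustration (not part of the IATA analysis), the quoted rate can be converted into the years-of-daily-flying timescale as follows; the only inputs are the 0.11 per million sectors figure from the text and the assumption of one flight per day.

```python
# Back-of-the-envelope conversion of the IATA 2022 fatality risk figure
# (0.11 per million sectors) into the "years of daily flying" timescale
# quoted in the text. Illustrative only.

fatality_risk_per_sector = 0.11 / 1_000_000   # probability per flight (sector)
flights_per_year = 365                         # assume one flight every day

expected_sectors_to_accident = 1.0 / fatality_risk_per_sector
expected_years = expected_sectors_to_accident / flights_per_year

print(f"{expected_sectors_to_accident:,.0f} sectors "
      f"~= {expected_years:,.0f} years of daily flying")
# -> roughly 9,090,909 sectors, i.e., about 25,000 years
```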

Safety-critical systems are essential in preventing catastrophic outcomes in many industrial sectors and domains, and they are integral to many aspects of daily life. Examples of safety system applications are listed in Table 1-1.

Given the necessity for safe outcomes and high reliability of these systems, many are regulated and have rigorous safety and reliability standards that codify accepted risk management requirements for domain-specific hazards. Functional safety assessment, a common practice for engineering safety systems, is the systematic evaluation of potential system failures and faults, intended to inform system designs that ensure safe outcomes. It incorporates system-wide hazard analysis, specification, design, verification, and validation techniques that are essential to establishing safety system performance. The functional safety standard International Electrotechnical Commission (IEC) 61508-1 specifies basic functional safety requirements for electrical, electronic, and programmable electronic safety-related systems used across many safety-critical domains.4 These basic requirements are often expanded and incorporated into new standards that address the unique needs of specific domains, such as road vehicles (ISO 26262),5 railway systems (IEC 62425),6 and aviation applications (DO-178C).7

___________________

3 International Air Transport Association (IATA), 2023, “IATA Releases 2022 Airline Safety Performance,” Press Release No. 7, March 7, https://www.iata.org/en/pressroom/2023-releases/2023-03-07-01.

4 International Electrotechnical Commission (IEC), 2010, “Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems – Part 1–7,” IEC 61508.

5 International Organization for Standardization (ISO), 2011, “Road Vehicles—Functional Safety: Part 1: Vocabulary,” https://www.iso.org/standard/43464.html.

6 International Electrotechnical Commission, 2025, “Railway Applications—Communication, Signalling and Processing Systems—Safety Related Electronic Systems for Signalling,” IEC 62425:2025, https://webstore.iec.ch/en/publication/68909.

7 D. Wright, Z. Stephenson, and M. Beeby, 2021, Efficient Verification Through the DO-178C Life Cycle, Rapita Systems, https://www.do178.org.


TABLE 1-1 Some Common Classes of Safety-Critical Systems and Respective Domains

Aviation Systems:
Flight control systems
Engine control systems
Collision avoidance systems
Navigation systems

Automotive Systems:
Anti-lock braking systems (ABS)
Electronic stability control (ESC)
Adaptive cruise control
Collision avoidance systems

Energy and Utilities:
Power grid control systems
Gas pipeline control systems
Water treatment and distribution systems

Medical Devices and Health Care:
Life support systems
Ventilators, defibrillators
Radiation therapy machines
Diagnostic assistance and imaging

Building and EMS Safety Systems:
Fire detection and suppression systems
Man-down and access control systems
Emergency call systems (e.g., 911)
Communications for public safety

Nuclear Power Systems:
Reactor control systems
Emergency shutdown systems
Radiation monitoring systems
Cooling systems

Industrial Process Control:
Chemical plant process control systems
Oil and gas refinery control systems
Power plant control systems

Automated Manufacturing Systems:
Robotic control systems
Safety lockout systems in manufacturing plants

Railway Systems:
Train control and braking systems
Automatic train protection (ATP) systems
Positive train control (PTC) systems

Space Systems:
Launch vehicle control systems
Satellite control systems
Life support systems for astronauts

Traffic Control Systems:
Intelligent transportation systems (ITS)
Traffic light control systems

Defense Systems:
Missile guidance systems
Aircraft avionics systems
Command and control systems
Beyond mitigating risks associated with faults, the “safety of the intended functionality” approach in International Organization for Standardization (ISO) 21448:2022 complements failure-focused functional safety methods in autonomous systems by addressing the risks that arise when systems operate as intended but encounter situations that exceed their functional capabilities, as well as risks from reasonably foreseeable misuse.8

What Is Safety?

A common misconception is that safety is a binary notion, that a system is either safe or unsafe. Safety is also not an absolute notion: there is no perfect safety, and all systems have some element of risk.

In the spirit of establishing a consistent vocabulary as well as clarifying some notions, the committee defines the following terms below: harm, safety risk, tolerable risk, and safety (adapted from ISO/IEC Guide 51).9

Harm is typically associated with physical injury or damage to the health of people (either system users or other affected members of the public) or with damage to property or the environment. Safety risk is defined as a function of both how frequently harm occurs and how severe that harm is.

___________________

8 ISO, 2022, “Road Vehicles—Safety of the Intended Functionality,” ISO 21448:2022, https://www.iso.org/standard/77490.html.

9 ISO, 2014, “Safety Aspects—Guidelines for Their Inclusion in Standards,” ISO/IEC Guide 51:2014, https://www.iso.org/standard/53940.html.


Therefore, safety risk is a flexible notion that captures both rare but potentially catastrophic harmful events (e.g., a nuclear reactor meltdown) and more frequent harmful events of moderate severity (e.g., a car crash). Engineering safe systems relies on identifying the potential hazards and introducing effective safeguards that minimize the probability of the harmful event occurring, reduce the severity of the harm, or both.

A key challenge in many safety-critical systems is to define a notion of tolerable risk—the level of risk that is socially acceptable under specified contexts or operational conditions. Defining tolerable risk is difficult and requires a dialogue among engineers, corporations, regulators, and the public. Furthermore, tolerable risk changes from context to context. For example, an airplane used for commercial purposes will have a different tolerable risk than the same aircraft in a military context. The notion of tolerable risk also typically changes depending on the development phase of a fundamentally new or novel system. For example, as innovators, the Wright brothers were willing to accept higher risk associated with human flight than is accepted today—a choice that enabled one of the greatest engineering achievements in history. There is therefore an underlying relationship between safety and innovation, as the determination of tolerable risk can negatively affect safety (if set too high) or innovation (if set too low).

Once a notion of tolerable risk has been established, safety is said to be achieved when the system safety risk is below the socially accepted tolerable risk. The heart of safety engineering is to exhaustively identify all potential harms that the system could cause and assess the severity and probability of each harm. The severity and probability can then be combined to obtain the risk, and the goal of safety engineering is to develop a system design that pushes that risk below the socially acceptable tolerable risk. The design process often attempts to allocate risk or performance expectations to subsystems and components and to set risk reduction targets for each. Achieving those targets may require introducing sensing redundancy (e.g., radar and lidar in addition to visible light cameras), improving the resolution and reliability of the sensors, and so forth.
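As a minimal sketch of how severity and probability can be combined and compared against a tolerable-risk threshold, consider the following; the hazard names, probabilities, severity weights, and threshold are all hypothetical and chosen only for illustration.

```python
# Minimal sketch: combine probability and severity into a risk estimate and
# compare it against a tolerable-risk threshold. All numbers are hypothetical.

hazards = {
    # hazard name: (probability of harm per operating hour, severity weight)
    "sensor_blackout": (1e-6, 0.9),
    "brake_fault":     (1e-7, 1.0),
    "minor_overheat":  (1e-4, 0.05),
}

TOLERABLE_RISK = 1e-6   # hypothetical severity-weighted risk per hour

for name, (probability, severity) in hazards.items():
    risk = probability * severity           # simple frequency x severity model
    status = "acceptable" if risk < TOLERABLE_RISK else "needs mitigation"
    print(f"{name:15s} risk={risk:.2e}  {status}")
```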

Given the breadth of safety-critical domains highlighted in Table 1-1, it should not be surprising that tolerable risk is unique to each combination of application domain, system application environment, and potential risk exposure of the system users and members of the public.

In functional safety assessments, the reliability and performance expectations of different systems and subsystems are often described using a target safety integrity level (SIL). These typically range from SIL 1 (least safety critical) to SIL 4 (most safety critical). For example, traffic lights at intersections could require SIL 1 systems, while fly-by-wire aircraft flight control systems would need SIL 4. Achieving a higher SIL classification typically involves more redundancy, more rigorous testing, and stricter design and operational constraints.


SIL 4 systems are often those where failure could result in catastrophic consequences. As such, they require the most stringent safety measures and redundancies to minimize the likelihood of failure.
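The sketch below illustrates the kind of quantitative target that standards attach to each SIL, using the commonly cited IEC 61508 bands for dangerous failures per hour in high-demand or continuous operation; the lookup function itself is illustrative, not a normative implementation of the standard.

```python
# Illustrative lookup of SIL targets for continuous / high-demand operation,
# following the commonly cited IEC 61508 bands (dangerous failures per hour).
# A sketch only, not a normative implementation of the standard.

SIL_TARGETS_PER_HOUR = {
    1: (1e-6, 1e-5),   # SIL 1: >= 1e-6 and < 1e-5 dangerous failures per hour
    2: (1e-7, 1e-6),
    3: (1e-8, 1e-7),
    4: (1e-9, 1e-8),   # SIL 4: the most stringent band
}

def required_sil(failure_rate_per_hour: float) -> int:
    """Return the highest SIL whose band the given failure rate satisfies."""
    for sil in (4, 3, 2, 1):
        low, high = SIL_TARGETS_PER_HOUR[sil]
        if failure_rate_per_hour < high:
            return sil
    return 0  # does not meet even SIL 1

print(required_sil(5e-9))   # -> 4
print(required_sil(3e-6))   # -> 1
```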

Engineering Safety-Critical Systems

Historically, rigorous system development, as well as software engineering processes, followed the so-called V-model depicted in Figure 1-2. This is also known as the verification and validation (V&V) model. The V-model is frequently used for safety-critical systems because of its emphasis on safety requirements analysis, rigorous verification, and testing. This process also clearly defines the steps involved in the system or software development, starting with requirement analysis and specification, through design and testing, including deployment, customer acceptance, and ongoing maintenance.

The goal of the process is to identify errors or challenges in system design as early as possible. Errors discovered late in the design process are far more costly to fix than errors that are identified in the requirement analysis phase. The design, verification, and testing phases typically bring together numerous disciplines such as control system design, optimization, networking, software engineering, model checking, computer-aided verification, real-time systems, and software testing.

FIGURE 1-2 The verification and validation framework for safe systems engineering.
SOURCE: iStock/Piscine.

Complex systems are made up of heterogeneous components, integrating computing hardware and software. A major aspect of verification and validation of such systems is determining whether the heterogeneous composition of components will result in a safe overall system.

From a safety perspective, for each potential hazard identified, a mitigation strategy needs to be developed, implemented, and further assessed to establish that any residual risks are tolerable.10 In many industries, risk mitigation strategies are presented as a “safety case” that captures the evidence and methods used to demonstrate that safeguards have been put in place to predictably ensure safe outcomes in anticipated operational environments.11 The success of this approach relies on defining the operating environments precisely, so that safety is achieved as long as the system functions in environments similar to those considered during its design phase.

Recently, more iterative approaches have been developed, such as the DevOps model shown in Figure 1-3. It features continuous improvement across all stages of development and deployment and takes an iterative approach to the entire life cycle to ensure sustained improvement and agility.

FIGURE 1-3 The DevOps framework for safe systems engineering.
SOURCE: Modified from M. Kharnagy, 2016, “A Visual Representation of the DevOps Workflow,” https://commons.wikimedia.org/wiki/File:Devops-toolchain.svg. CC BY-SA 4.0.

___________________

10 N.G. Leveson, 2016, Engineering a Safer World: Systems Thinking Applied to Safety, MIT Press.

11 T. Kelly, 1998, “Arguing Safety—A Systematic Approach to Safety Case Management,” PhD thesis, Department of Computer Science, University of York.


An aspect of system safety that is difficult to capture in either the V-model or the DevOps framework is the role of the human in creating and preventing harms. Humans play different roles ranging from passive user (e.g., passenger) to controller (e.g., pilot). The passenger may cause harm by accidentally blocking a sensor or hitting a switch. A human controller may lose situational awareness, miss signs of an impending threat, and take incorrect actions. The design of the human–machine interaction to achieve high levels of safety is a central challenge that the human factors and human–computer interaction disciplines address. Current research seeks to model the extent to which system safety depends on the human operator. Another research avenue models the system and the human as a collaborative team and attempts to ensure that the behavior of the combined team achieves target risk levels.

1.2 EMERGENCE OF MACHINE LEARNING IN SAFETY-CRITICAL SYSTEMS

Machine learning (ML) models make predictions (or decisions) from data, identify patterns, and, in some instances, improve performance over time through experience. The formalization of the concept can be traced back to the late 1940s and early 1950s when researchers began exploring the idea of building computer programs that could learn from data and improve their performance over time.

One of the earliest significant contributions was Frank Rosenblatt’s development of the perceptron in 1957, a type of artificial neural network that could be trained to recognize patterns. Only in the 21st century, with the advent of very large data sets and major advances in computing power, did ML see a significant resurgence and rapid development. This resurgence led to breakthroughs in deep learning and other subfields of ML, making it one of the most prominent and influential areas of computing today.

In 2012, a deep convolutional neural network (i.e., a neural network with multiple layers of computing units) called AlexNet achieved a breakthrough in recognition accuracy on the ImageNet benchmark. Each image in ImageNet is labeled with an object category (e.g., “airplane”) that is present in the image. AlexNet demonstrated dramatically improved performance in image classification tasks compared to all previous methods in computer vision. AlexNet’s success marked a turning point in the use of deep neural networks for computer vision and helped popularize deep learning as a powerful approach for a wide range of applications.

Researchers began to study more complex network architectures and techniques, which paved the way for the rapid advancement of deep learning in subsequent years. These advances have since seen wide application, including in computer vision, natural language processing, speech recognition, biology, and robotics—a list that continues to expand as researchers and entrepreneurs discover new tasks and domains where ML can be applied.



In particular, deep learning is being applied in safety-critical applications such as medical imaging, traffic control, self-driving vehicles, avionics, unmanned aerial and underwater vehicles, and mobile robots. Deep learning greatly expands the capabilities of these systems with richer perception and more efficient and effective controllers.

Taxonomy of Machine Learning

The main approaches of ML can be categorized as follows.

Supervised Learning

In supervised learning, models are trained on labeled data sets. In this context, each data point is paired with a known output or target value (often called the “label”). For example, in image recognition, each data point is an image, and the label might be “pedestrian.” The primary objective of supervised learning is to enable the learning algorithm to establish a functional mapping between input data and corresponding output labels. Given a new input, the trained model can apply this mapping to make classifications or predictions.

Notable applications of supervised learning encompass image classification, speech recognition, natural language translation, and spam email detection. Prominent approaches employed in this paradigm include linear regression, decision trees, support vector machines, and, most prominently, deep neural networks.
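A minimal supervised-learning sketch using scikit-learn is shown below; the data set is synthetic, whereas in a real perception task the inputs would be images and the labels object categories.

```python
# Minimal supervised learning sketch: fit a classifier on labeled data and
# predict labels for new inputs. Data here is synthetic for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic "labeled data set": each row of X is an input, y holds the label.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier(max_depth=5, random_state=0)
model.fit(X_train, y_train)                 # learn the input -> label mapping

predictions = model.predict(X_test)         # apply the mapping to new inputs
print("held-out accuracy:", accuracy_score(y_test, predictions))
```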

Unsupervised Learning

Unsupervised learning is a broad class of methods that find patterns, structures, or relationships in data without any target labels. Common objectives in unsupervised learning encompass clustering (grouping similar data points together), dimensionality reduction (simplifying complex data representations), and density estimation (fitting a probability model to the data). Practical applications include customer segmentation, anomaly detection, and topic modeling. Notable techniques in unsupervised learning encompass K-means clustering, principal component analysis (PCA), hierarchical clustering, and deep density estimation.
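A minimal unsupervised-learning sketch, again with synthetic data, clusters unlabeled points with K-means and compresses them with PCA.

```python
# Minimal unsupervised learning sketch: cluster unlabeled data with K-means
# and reduce its dimensionality with PCA. Data is synthetic for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Unlabeled data: two blobs in 5-dimensional space, no target labels given.
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)),
               rng.normal(4.0, 1.0, (100, 5))])

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
X_2d = PCA(n_components=2).fit_transform(X)   # compress 5-D points to 2-D

print("cluster sizes:", np.bincount(clusters))
print("2-D projection shape:", X_2d.shape)
```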

Reinforcement Learning

Reinforcement learning seeks to train an autonomous agent to make sequences of decisions within a dynamic environment. The aim is to enable such agents to maximize the sum of rewards received over time. Agents learn iteratively through trial and error and receive feedback in the form of rewards or punishments (negative rewards) based on their actions. Notable applications of reinforcement learning span domains such as game playing (e.g., the achievement of AlphaGo), robotic control, and the operation of autonomous systems.


Many reinforcement learning problems are formulated as Markov decision processes (MDPs) in which the Markov transition dynamics and the rewards must be learned through interaction with the environment. The result of reinforcement learning is a policy, or control law, that maps observed states to appropriate control actions. Popular reinforcement learning algorithms include Q-learning, deep Q-networks, and proximal policy optimization.
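A tabular Q-learning sketch on a tiny, invented chain environment illustrates the trial-and-error update described above; the environment, rewards, and hyperparameters are chosen only for illustration.

```python
# Tabular Q-learning sketch on a tiny chain MDP: states 0..4, actions
# 0 = left, 1 = right; reaching state 4 yields reward 1. Purely illustrative.
import numpy as np

N_STATES, N_ACTIONS, GOAL = 5, 2, 4
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, epsilon = 0.1, 0.95, 0.3
rng = np.random.default_rng(0)

def step(state, action):
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

for _ in range(500):                      # episodes of trial and error
    state, done, steps = 0, False, 0
    while not done and steps < 100:       # cap episode length for the sketch
        # epsilon-greedy action selection
        action = rng.integers(N_ACTIONS) if rng.random() < epsilon else int(Q[state].argmax())
        nxt, reward, done = step(state, action)
        # Q-learning update toward reward plus discounted best future value
        Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
        state = nxt
        steps += 1

# The learned greedy policy should move right (action 1) in non-goal states.
print("greedy policy (0=left, 1=right):", Q.argmax(axis=1))
```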

Self-Supervised Learning

For complex objects, such as sentences, images, and trajectories, it is possible to train neural networks to predict parts of the object from the remainder of the object. In language, for example, the self-supervised objective is typically to predict the next word in a sentence based on all of the previous words in the sentence. Similarly, in images, a network can learn to predict a portion of the image that has been deleted. Self-supervised learning has two important applications. First, some applications, such as image in-painting (filling in the occluded parts of an image) and natural language generation (generating a plausible continuation of a sentence), can be solved directly via self-supervised learning. Second, self-supervised learning can learn high-dimensional representations that can provide a basis for subsequent supervised learning to solve specific application tasks, such as classifying images from X rays, computed tomography scans, and magnetic resonance imaging in medicine. This is known as self-supervised representation learning. Representation learning can also be achieved by the method of instance-contrastive learning wherein a neural network learns a representation of objects that is invariant to a set of perturbations (changes in lighting, motion blur, size, orientation, etc.).
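A toy sketch of the self-supervised objective for text is shown below: (context, next-word) training pairs are constructed directly from a raw sentence, with no human-provided labels; the sentence itself is arbitrary.

```python
# Toy self-supervised setup for language: turn a raw sentence into
# (context, next-word) training pairs without any human-provided labels.

sentence = "the aircraft began its final descent toward the runway".split()

pairs = [(sentence[:i], sentence[i]) for i in range(1, len(sentence))]
for context, target in pairs[:3]:
    print(f"context={context!r} -> predict {target!r}")
# A language model would be trained to predict `target` from `context`;
# the representations it learns can then be reused for downstream tasks.
```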

Neurosymbolic Learning

This type of ML operates on the assumption that two elements are essential for achieving human-like intelligence: learning and compositionality. Compositionality is the ability to form and understand new, complex concepts by combining basic elements, or “primitives.” These primitives are modular and reusable, and each can itself be learned. The corresponding models—often referred to as neurosymbolic ML—are poised to enhance ML systems by improving capabilities such as abstraction, analogy, and reasoning, as well as by increasing explainability and facilitating safety-focused decision making.

Neurosymbolic models typically comprise a hand-crafted stack of “perception” and “reasoning” sub-models integrated with knowledge-based approaches. Because of their manually tailored and domain-specific workflow designs, these algorithms primarily excel in simulated environments but face challenges in generalizing to new or unseen tasks.


A significant limitation of these models is their restricted ability to comprehend complex environments. This shortcoming is arguably due to their loosely connected perception and reasoning/control components and the inherent difficulties in training such models.

Generative Pre-Trained Models

Generative pre-trained models, exemplified by GPT (Generative Pre-trained Transformer) and CLIP (Contrastive Language-Image Pre-training), represent a very recent but pivotal advancement in ML. By applying self-supervised training techniques, these models can map complex inputs (e.g., images, text, code) into complex outputs (again, images, text, code). They have significantly enhanced the fields of natural language understanding, image processing, and cross-modal tasks. They are pre-trained on extensive (Internet-scale) data sets containing diverse text and image data, thus enabling them to acquire comprehensive, generalized representations of both language and visual information. GPT models, for instance, excel in language generation, text completion, and code generation. CLIP, on the other hand, stands out in tasks that involve the joint comprehension of textual and visual information, such as image captioning and zero-shot image classification.

These models are said to be “pre-trained,” because the initial, large-scale self-supervised training is typically followed by a second phase of task-specific training using supervised or reinforcement learning. These models are also called “foundation models,” because a single pre-trained model can provide a foundation for creating many task-specific systems.

These basic concepts and models constitute the underpinnings of contemporary ML. ML has catalyzed numerous groundbreaking developments in artificial intelligence (AI) and data science. Other more specialized approaches include semi-supervised learning, transfer learning, domain adaptation, imitation learning, and online learning. The choice of which type of ML to apply depends on the specific problem and the availability of labeled data, among other factors.

Machine Learning in Safety-Critical Systems

The remarkable capabilities demonstrated by modern ML techniques, particularly deep learning, have inspired researchers and practitioners to incorporate ML into safety-critical systems to expand the functionality of these systems and the range of environments in which they can operate. Among the various application domains, the automotive industry has witnessed a revolutionary transformation, with deep learning playing a pivotal role, alongside notable advancements in robotics, controls, sensors, and lidar technology.


In the automotive sector, the fusion of these technological innovations has set the stage for a future centered around self-driving vehicles such as those depicted in Figure 1-4. Both incumbent automakers and new entrants have invested heavily in developing increasingly capable advanced driver-assistance systems and in working toward automated driving in an expanding range of operational contexts.

To better understand the role of ML, the committee first reviewed the Society of Automotive Engineers (SAE) levels of automation.12 These provide a structured way to categorize the degree of autonomy in vehicles, ranging from Level 0 to Level 5. These levels help clarify how much a vehicle can operate without human intervention and are used for discussing and understanding autonomous and semi-autonomous vehicle capabilities.

  • Level 0 (no automation) requires the human driver to be entirely responsible for driving, including control and monitoring. Basic driver assistance features such as anti-lock brakes are present but do not constitute automation.
  • Level 1 (driver assistance) introduces some driver assistance, such as steering or acceleration control but not both simultaneously. Adaptive cruise control is an example.
  • Level 2 (partial automation) enables the vehicle to simultaneously control both steering and acceleration/deceleration under specific conditions. The driver must stay engaged and be ready to take control.
  • Level 3 (conditional automation) allows the vehicle to manage all driving tasks in certain conditions. The driver can disengage but must be prepared to intervene when requested.
  • Level 4 (high automation) enables the vehicle to operate fully autonomously within predefined domains. No human intervention is needed within these boundaries.
  • Level 5 (full automation) represents complete autonomy, with vehicles capable of driving in all conditions without human input. These vehicles do not require traditional driver controls.

FIGURE 1-4 Self-driving vehicles by Cruise, Zoox, and Waymo.
SOURCE: (left and right) Copyright 2019 The Associated Press. All rights reserved; (middle) iStock.com/LPETTET.

___________________

12 SAE International, 2021, “Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles,” SAE J3016_202104, April 30, https://www.sae.org/standards/content/j3016_202104.

Most cars on the road today are Level 0 to Level 3 vehicles, which also represents the current state of the art in deployed systems. Reaching higher levels of autonomy, particularly Levels 4 or 5, requires operating all vehicle functions in environments that have not been previously mapped and that may contain novel obstacles, novel road conditions, and other vehicles that may behave in unexpected ways. Such environments are said to be “open worlds,” as opposed to “closed worlds,” which are environments that are fully mapped and modeled. Level 4 autonomy can place restrictions on the operating conditions, such as daytime driving on dry freeways; Level 5 must operate without such conditions. Reaching Level 4 or 5 autonomy critically requires perception and mapping techniques that give the car real-time situational awareness of its environment. It also requires the vehicle to detect when it is encountering novelty and is therefore uncertain about the behavior of unfamiliar vehicles, obstacles, and road conditions. In such cases, the control system must “hedge its bets” and take actions that increase the margin of safety (e.g., slow down, allow more space between the vehicle and the unknown objects).
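A schematic sketch of this hedging behavior follows: when the (hypothetical) perception module reports low confidence about a detected object, the controller lowers its speed cap and increases its following distance. The thresholds and numerical values are invented for illustration and are not drawn from any deployed system.

```python
# Schematic sketch of uncertainty-aware "hedging": when perceptual uncertainty
# about an object is high, reduce the speed cap and increase following distance.
# All thresholds and values are invented for illustration.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float      # perception confidence in [0, 1]

def speed_and_gap(detection: Detection,
                  nominal_speed_mps: float = 30.0,
                  nominal_gap_m: float = 40.0) -> tuple[float, float]:
    """Return (speed cap, following gap) scaled by perceptual uncertainty."""
    uncertainty = 1.0 - detection.confidence
    if uncertainty > 0.5:                    # very unsure: slow down sharply
        return nominal_speed_mps * 0.4, nominal_gap_m * 2.0
    if uncertainty > 0.2:                    # somewhat unsure: mild hedge
        return nominal_speed_mps * 0.7, nominal_gap_m * 1.5
    return nominal_speed_mps, nominal_gap_m  # confident: nominal behavior

print(speed_and_gap(Detection("unknown_obstacle", confidence=0.35)))  # strong hedge
print(speed_and_gap(Detection("vehicle", confidence=0.97)))           # nominal
```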

The above case study highlights one of the key reasons ML is being considered in safety-critical settings: it allows these systems to operate with greater flexibility in open worlds. While self-driving vehicles have captured the imagination of innovators, researchers, regulators, and the broader public, ML is being explored for use in many of the other safety-critical domains listed in Table 1-1. Chapter 2 provides additional case studies that show the promise of ML in various safety-critical systems.

Challenges Posed by Machine Learning

Several inherent properties and known shortcomings of ML pose challenges for incorporating ML components into safety-critical systems.

First, ML approaches are inherently statistical. A complex model, such as a deep neural network computer vision system, is trained by algorithmically adjusting millions or billions of numerical parameters with the goal of minimizing a “loss function.” The loss function measures the errors that the model makes on the training data, including failures to detect objects, incorrect classification of object types, positional errors in localizing objects in space, and so on. It is rare for the learning algorithm to find a setting of the numerical parameters that achieves zero loss. And even if the network has zero loss on the training data, there is no guarantee that it will have zero loss on new images encountered after deployment. (There are statistical guarantees that bound the loss, but they make assumptions that are violated in practice.)


The result is that the predictions output by the neural network are always approximate and uncertain. An ongoing research challenge is to quantify this uncertainty so that subsequent decision making can take it into account.
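The two ideas in this paragraph, a loss averaged over training examples and predictions that are probabilities rather than certainties, can be sketched as follows; the logits and labels are made up.

```python
# Sketch of the two ideas above: (1) a loss function that averages the model's
# errors over training examples, and (2) softmax outputs as uncertain,
# probabilistic predictions rather than definitive answers. Values are made up.
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

# Raw scores ("logits") for 3 training images over classes [car, pedestrian, cyclist]
logits = np.array([[4.0, 1.0, 0.5],
                   [0.2, 2.5, 2.3],
                   [1.0, 1.1, 0.9]])
labels = np.array([0, 1, 2])               # true class index for each image

probs = softmax(logits)
cross_entropy = -np.log(probs[np.arange(len(labels)), labels]).mean()
print("training loss:", round(float(cross_entropy), 3))

# Even the "best" prediction is only a probability, never a certainty:
print("predicted distribution for image 2:", probs[2].round(3))
```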

Second, a learned model cannot be trusted on inputs that lie outside the coverage of its training data. Furthermore, the training data from many sensors (e.g., video, lidar, radar) is very high-dimensional, which means the notion of “coverage” is hard to define, because every input is quite far away (e.g., in terms of raw pixels) from every other input. Another reason that coverage is hard to define is that many ML projects do not collect training data systematically but instead use data that was collected for other purposes. Without clear boundaries guiding the data collection, it is hard to define the coverage of the data.

This problem is exacerbated by the prevailing culture in the ML research community, which tends to seek general purpose solutions that work across many environments. In other words, a model’s space of environments (the possible conditions and contexts in which it is used) is only defined implicitly as the space spanned by the training data or the test data. This is best illustrated by the recent foundation models such as GPT or CLIP, which learn from extensive and diverse training data sets and can generate content over a seemingly endless topic space. This gives these ML systems a much broader scope than anything seen before, but at the same time their space of environments is harder to characterize from a safety perspective.

Third, given a new input data point, existing ML methods have difficulty deciding whether that data point is well covered by the training data. In open worlds, an ML-based system will inevitably encounter novelty and environmental change. Hence, safety-critical systems need a way to detect novelty and act appropriately. Ideally, in addition to taking safe immediate actions, the system will also capture the novelty, add it to the training data, and retrain the ML components.
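One simple (and by no means the only) heuristic for novelty detection is sketched below: score a new input by its distance to the nearest training points in some feature space and flag it when that distance exceeds a threshold calibrated on the training data. The features here are synthetic stand-ins for learned representations.

```python
# Simple novelty-detection sketch: flag an input as "novel" when its distance
# to the nearest training points (in feature space) exceeds a threshold
# calibrated on the training data itself. Synthetic data, illustrative only.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train_features = rng.normal(0.0, 1.0, (1000, 16))   # stand-in for learned features

knn = NearestNeighbors(n_neighbors=5).fit(train_features)

# Calibrate: typical distance of training points to their own nearest neighbors.
train_dist, _ = knn.kneighbors(train_features)
threshold = np.quantile(train_dist.mean(axis=1), 0.99)

def is_novel(x: np.ndarray) -> bool:
    dist, _ = knn.kneighbors(x.reshape(1, -1))
    return bool(dist.mean() > threshold)

print(is_novel(rng.normal(0.0, 1.0, 16)))   # in-distribution -> likely False
print(is_novel(rng.normal(8.0, 1.0, 16)))   # far from training data -> True
```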

Fourth, the internals of large ML models, such as deep neural networks, are difficult to understand. It is easy to see that the networks are extracting informative features from the images, but it is difficult to determine how those features are defined and circumscribed. This makes it impossible to, for example, inspect the model internals to detect coverage gaps and errors. An active area of ML research is to find ways of extracting causal explanations for network behavior from the millions (or billions) of model parameters.

Finally, ML models are vulnerable to adversarial attacks in which small changes to the image (or to the physical environment) cause large mistakes in the model predictions. Figure 1-5 shows how a small perturbation added to an image of a school bus causes the AlexNet deep network to misclassify it as an ostrich. Adversarial examples are possible because the high dimensionality of images makes it easy to create a large overall perturbation by making small changes to many pixels.

FIGURE 1-5 An adversarial attack on AlexNet. The left image is correctly classified by AlexNet as a school bus. The center image shows the perturbation added to the left image (exaggerated by a factor of 10). The right image is incorrectly classified as an ostrich. The images appear identical to the human eye.
SOURCE: A. Pedraza, O. Deniz, and G. Bueno, 2022, “Really Natural Adversarial Examples,” International Journal of Machine Learning and Cybernetics 13:1065–1077, https://link.springer.com/article/10.1007/s13042-021-01435-0. CC BY 4.0.

The perturbation can be designed to move the image across the “boundary” between school buses and ostriches that the neural network has learned. While progress has been made on preventing these simple perturbation-based adversarial attacks, more sophisticated attacks continue to be developed, and successful defense remains difficult.13
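The classic fast gradient sign method (FGSM) illustrates how such a perturbation can be constructed. The sketch below uses PyTorch with a stand-in linear classifier rather than AlexNet, and the perturbation budget is arbitrary; it shows the mechanics, not a reproduction of Figure 1-5.

```python
# Fast gradient sign method (FGSM) sketch: perturb an image in the direction
# that increases the model's loss, keeping each pixel change tiny. The model
# here is a stand-in, not AlexNet, and epsilon is arbitrary.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in classifier
loss_fn = nn.CrossEntropyLoss()

image = torch.rand(1, 3, 32, 32, requires_grad=True)   # stand-in input image
true_label = torch.tensor([3])

loss = loss_fn(model(image), true_label)
loss.backward()                                         # gradient of loss w.r.t. pixels

epsilon = 2.0 / 255.0                                   # small per-pixel budget
adversarial = (image + epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()

print("max per-pixel change:", float((adversarial - image).abs().max()))
print("original prediction:   ", int(model(image).argmax()))
print("adversarial prediction:", int(model(adversarial).argmax()))
```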

These properties make it difficult to integrate ML into existing safety engineering practice. Established safety processes for systems and software can effectively address general risks, but the integration of ML components will need a more specialized approach, as noted in recent technical reports.14 The statistical nature of ML means that, for example, perceptual systems based on computer vision will have significant error rates. On the ImageNet benchmark, an error rate of 5 percent is considered excellent performance, but this falls well short of the expectations of current safety-critical software requirements and validation techniques (IEC 61508-3) that enable critical subsystems to have failure rates below 10⁻⁹.15 The safety gap between 95 and 99.99999999 percent is very large and essentially impossible to bridge through statistical learning alone. Consequently, safety-critical systems will need to be designed to make decisions under new expectations that include substantial levels of perceptual uncertainty.

The safety-critical community remains skeptical of using AI models. The committee responsible for developing many of the applicable international standards, ISO/IEC JTC1 SC 42, has reviewed the potential use of ML in functional safety systems and acknowledges that many current standards for safety-related systems do not directly address ML systems and techniques; indeed, some safety domain standards explicitly forbid certain ML applications.

___________________

13 A. Krizhevsky, I. Sutskever, and G.E. Hinton, 2012, “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems 25.

14 ISO/IEC, 2024, “Artificial Intelligence – Functional Safety and AI Systems,” TR 5469.

15 A. Krizhevsky, I. Sutskever, and G.E. Hinton, 2012, “ImageNet Classification with Deep Convolutional Neural Networks,” Advances in Neural Information Processing Systems 25.


Organizations responsible for these standards are just beginning to consider and to codify ML risk mitigation techniques. Owing in part to differences in development processes, the large body of knowledge on software safeguards and validation techniques against systematic failures in non-ML systems is not generally compatible with ML technologies, which are not specification driven. This issue is further addressed in Chapter 3.

A Substantial Scientific Divide

The difference in engineering practice and culture between the ML and safety engineering disciplines leads to the most important finding of this report, which appears at the end of this section. This finding transcends and summarizes many other findings that can be found later in this report.

ML researchers appear ready to enter safety-critical domains without carefully examining all the risks, let alone demonstrating safety with respect to them. This has resulted in systems being deployed in public without assurance of the safety properties needed to protect the public. This topic is discussed in Chapter 5, which focuses on societal considerations of ML in safety-critical domains.

Unfortunately, there is minimal interaction between the ML and safety engineering communities. It is of the utmost importance that a dialogue among these communities begin in scientific conferences, industrial bodies, regulatory bodies, standards development organizations, and universities. As articulated in Chapter 4, there are numerous research challenges that should be pursued jointly, with each discipline bringing its perspective to the combined agenda. These research challenges provide an initial roadmap for making ML better suited for safety while also enabling safety engineers to safely embrace the power of ML.

Embodied Artificial Intelligence: A National Priority

To bridge the divide between the ML and safety-critical systems worlds, it is informative to look at past experience. At the turn of the 21st century, a cyber-physical systems initiative was launched in response to the need to integrate computing with physical systems to dramatically improve their performance, efficiency, and safety. This initiative was urged by the federal Networking and Information Technology Research and Development (NITRD) Program of the National Science and Technology Council and the President’s Information Technology Advisory Committee, and it marked a turning point at which the successful integration of computational algorithms with physical systems and processes became a national priority.


Subsequent cyber-physical systems programs, orchestrated by funding agencies, including the National Science Foundation (NSF) and the Defense Advanced Research Projects Agency (DARPA), together with application-oriented agencies, such as the Department of Transportation (DOT), the U.S. Department of Agriculture, the National Institutes of Health (NIH), and the National Institute of Standards and Technology (NIST), have had a profound impact across diverse sectors, including health care, transportation, agriculture, and manufacturing, both nationally and globally.

Today, a new computing revolution is being driven by advancements in AI technologies. AI is rapidly becoming a key computational paradigm and has already begun reshaping industries and societal norms. In the spirit of the 21st-century initiative that birthed the cyber-physical system program, it is time to launch a similarly bold effort to integrate AI across myriad applications where AI-empowered systems interact with the physical world. Such systems can be described as “embodied AI,”16 which is defined as “systems which engage in a purposeful exchange of energy and information with a physical environment.”17,18 Research in embodied AI generally has lagged AI applications in such areas as information retrieval, language processing, and gaming. A unified effort in embodied AI would bring together researchers in AI, ML, computer vision, and safety-critical systems to create a fundamentally new level of capabilities for safety-critical systems.

As with cyber-physical systems, realizing the full potential of embodied AI will once again require transcending disciplinary boundaries, involving academia, industry, and government agencies in a coordinated effort to shape the future of AI deployment in the physical world. Ensuring the safety of AI in the physical world involves addressing challenges unique to embodied systems where real-world consequences can have immediate and tangible impacts. Robust safety measures must underlie the development and deployment of AI in the context of such systems, considering factors such as unpredictability, variability, and the potential for unforeseen interactions. This requires a comprehensive approach that incorporates not only technological advancements but also ethical considerations, regulatory frameworks, and interdisciplinary collaboration.

Finding 1-1: Integration of ML in safety-critical cyber-physical systems exposes a fundamental challenge—the ML and safety engineering communities have different methods, standards, and cultures that must be reconciled if the performance and safety potential of ML is to be realized.

___________________

16 While this report focuses on machine learning, this section uses “artificial intelligence” and “AI” to be consistent with the term “embodied AI,” which is already familiar in the community.

17 N. Roy, I. Posner, T. Barfoot, et al., 2021, “From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence,” arXiv preprint, https://arxiv.org/abs/2110.15245.

18 R. Firoozi, J. Tucker, S. Tian, et al., 2023, “Foundation Models in Robotics: Applications, Challenges and the Future,” The International Journal of Robotics Research, https://doi.org/10.1177/02783649241281508.


It is imperative that the realization of this emerging embodied paradigm fundamentally rests on the pillars of safety, reliability, and trustworthiness. To ensure the successful integration of AI across systems and applications that affect the physical world, accelerated scientific research and exploratory development at the intersection of AI and safety is needed. This report aims to provide an initial roadmap for this transformation by articulating foundational principles and critical considerations for navigating the early stages of AI, specifically ML, in the physical world. Nevertheless, it is crucial to recognize that the path ahead demands a more comprehensive and inclusive dialogue to address the multifaceted challenges we face.
