Making Machine Learning Safer in High-Stakes Settings
Feature Story
By Sara Frueh
Last updated October 20, 2025
Image: Aerial view of an orange car stopped at a pedestrian crossing, using sensors and radar to detect pedestrians, illustrating autonomous driving technology.
Machine learning, a key enabler of artificial intelligence, is increasingly used for applications like self-driving cars, medical devices, and advanced robots that work near humans — all contexts where safety is of critical importance.
Engineers anticipate that machine learning will enable new capabilities and efficiencies in these systems, but it also introduces risks. An autonomous vehicle may fail to perceive a pedestrian and strike them, for example, or an autonomous robot in a factory might mistake a worker for an object to be moved.
Right now, machine learning can’t perform at the level of reliability expected in safety-critical applications, said George Pappas, associate dean of research at the School of Engineering and Applied Science at the University of Pennsylvania.
“There is a critical gap between the performance of machine learning [and] what we would expect in a safety-critical system,” Pappas said. “In the machine learning community … people may be happy with a performance of 95 or 97 percent. In safety-critical systems, we’d like errors of 10⁻⁹” — in other words, a system that is error-free 99.9999999 percent of the time.
Pappas chaired a National Academies study committee that wrote a report on how to bridge that gap by improving the safety of both machine learning components and the systems in which they’re embedded. Along with two fellow committee members, he spoke at a recent webinar that discussed the new report.
Improving performance and building guardrails
Machine learning components learn by detecting patterns in training data and using that “knowledge” to inform system actions or decisions. Incomplete training data or incomplete sensing of the environment can leave systems unable to accurately perceive situations they encounter, with potentially harmful consequences.
One current problem is that training data are often collected as a side effect of ordinary business processes, divorced from the setting in which the technology will ultimately be used, said Thomas Dietterich, distinguished professor emeritus in the School of Electrical Engineering and Computer Science at Oregon State University.
Instead, the data should be gathered with a strong focus on covering the actual environment in which the system is expected to operate, he said. That includes data reflecting the full range of real-world conditions — such as variations in weather and lighting, for example — that the system will encounter when it’s deployed.
Another challenge for machine learning is novelty: when systems meet something new that their training data hasn’t equipped them to identify. A self-driving car may encounter an animal or transportation device it has never seen before, for instance.
“Machine learning systems tend to fail when they encounter novelty,” Dietterich said. “We need an outer loop that can detect and characterize novelties when they occur, and then we need processes in place, both automated and human organizational processes, to collect additional data and retrain and revalidate the system to ensure that it’s properly handling the discovered novelties.”
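To make the idea concrete, here is a minimal Python sketch of such an outer loop, assuming a simple confidence-based novelty signal; the detector and the cutoff value are illustrative assumptions, not methods taken from the report.

    # Minimal sketch of the "outer loop" idea: flag inputs the model seems
    # unequipped to handle so they can be reviewed, added to the training
    # data, and the system retrained and revalidated. The low-maximum-
    # probability detector and the 0.85 cutoff are illustrative assumptions.
    class NoveltyMonitor:
        def __init__(self, min_confidence=0.85):
            self.min_confidence = min_confidence  # assumed confidence cutoff
            self.flagged = []                     # inputs queued for review and retraining

        def check(self, observation, class_probabilities):
            confidence = max(class_probabilities)
            if confidence < self.min_confidence:
                # Treat the input as novel: log it and tell the caller to
                # fall back to a conservative behavior.
                self.flagged.append((observation, confidence))
                return True
            return False

    # Example: a detection with spread-out probabilities is flagged as novel.
    monitor = NoveltyMonitor()
    print(monitor.check("camera_frame_0421", [0.40, 0.35, 0.25]))  # True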
Even with improvements in data and learning processes, machine learning assessments inherently have a degree of uncertainty, and safety-critical systems need to be designed with that expectation in mind, the report says. For example, a machine learning component in a self-driving car should be able to indicate “my uncertainty is high,” and the car should slow down, take more pictures, and gather more data to learn more about the situation.
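A hedged sketch of that behavior, assuming a perception component that reports an uncertainty score alongside its label; the field names and the 0.2 threshold are illustrative, not drawn from the report.

    # Sketch of acting on reported uncertainty: when the perception
    # component's uncertainty estimate is high, the vehicle slows down and
    # gathers more observations before committing to a maneuver.
    from dataclasses import dataclass

    @dataclass
    class Perception:
        label: str          # e.g., "pedestrian"
        uncertainty: float  # 0.0 = fully confident, 1.0 = no confidence

    def plan_next_actions(p: Perception, high_uncertainty: float = 0.2):
        if p.uncertainty > high_uncertainty:
            # "My uncertainty is high": slow down and collect more data.
            return ["reduce_speed", "capture_more_frames", "re_evaluate"]
        return ["proceed"]

    print(plan_next_actions(Perception("pedestrian", uncertainty=0.45)))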
Machine learning systems will also make mistakes — such as failing to detect an obstacle or overestimating the distance between one vehicle and another — and the system design must include redundancy and guardrails to detect when a mistake is likely and switch to a more reliable backup strategy, said Dietterich.
“We need to make architectural changes in our safety-critical systems to mitigate or deal with the potential shortcomings of machine learning components,” he said.
Pappas, too, stressed the need for such measures. “Every time we use machine learning in safety-critical settings, we need to develop safety filters, guardrails,” he said. If a car or robot misclassifies someone, there should be safeguards that can prevent an accident. “That is the most immediate challenge that we face,” he said.
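The guardrail pattern the two describe might look something like the following sketch, in which an independent, conservative check can override the machine learning component; the clearance rule and the 2.0-meter threshold are assumptions made purely for illustration.

    # Illustrative sketch of a runtime safety filter: an ML-generated plan is
    # accepted only if an independent, conservative check passes; otherwise
    # the system switches to a simple, reliable fallback (here, braking).
    from dataclasses import dataclass

    @dataclass
    class Plan:
        name: str
        closest_approach_m: float  # predicted minimum distance to any detected person or obstacle

    def emergency_brake() -> Plan:
        # Conservative fallback: stop rather than trust a risky ML plan.
        return Plan(name="emergency_brake", closest_approach_m=float("inf"))

    def safety_filter(ml_plan: Plan, min_clearance_m: float = 2.0) -> Plan:
        if ml_plan.closest_approach_m < min_clearance_m:
            return emergency_brake()  # guardrail overrides the ML component
        return ml_plan

    # Example: a plan that passes too close to a detected pedestrian is rejected.
    print(safety_filter(Plan("ml_lane_change", closest_approach_m=0.8)).name)  # emergency_brake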
The report also emphasizes the importance of new standards, regulations, and testing methods to address safety challenges in these systems and protect public safety and trust.
“Guidance for governing machine learning in safety-critical settings has made a lot of progress, but these measures so far are inadequate to accomplish what’s necessary for the future,” said Jonathan How, Ford Professor of Engineering at the Massachusetts Institute of Technology.
How also stressed the importance of transparency — both in technical analysis and in reporting of safety incidents, including near misses — and the need to learn from these incidents. “It’s essential to building and maintaining trust with the community,” he said.
Bridging disciplines and advancing education
Improving the reliability of safety-critical systems that use machine learning will mean bridging disciplines and developing new scientific approaches, the report says. Currently, the R&D communities for safety-critical systems and machine learning differ in their norms and standards, governance approaches, and culture.
“In the machine learning community, the focus has generally been on the average-case accuracy of the system, whereas in safety-critical systems we’re worried about the worst-case accuracy, particularly in areas of high risk, where we want to be absolutely sure we don’t collide with a pedestrian or a human worker,” said Dietterich.
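A small numerical illustration of that distinction, with made-up figures: a model can score well on average while performing poorly in a rare, high-risk condition.

    # The same model can look good on average yet perform poorly in a rare,
    # safety-critical condition. Conditions and numbers are invented for illustration.
    accuracy_by_condition = {
        "clear_day": 0.99,
        "rain": 0.97,
        "night_glare": 0.80,  # rare condition where a failure is dangerous
    }
    average_case = sum(accuracy_by_condition.values()) / len(accuracy_by_condition)
    worst_case = min(accuracy_by_condition.values())
    print(f"average-case accuracy: {average_case:.2f}")  # 0.92
    print(f"worst-case accuracy:   {worst_case:.2f}")    # 0.80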
The communities also differ in their design philosophies, noted Pappas. “In the safety-critical community, we restrict the domain to achieve a certain safety performance, whereas in machine learning the goal is to develop a model that generalizes across all environments.”
“Bringing those two design philosophies together will require a new engineering discipline … building a new scientific community that thinks on both sides of the coin to generate products that are safe and principles that lead to safer engineering systems,” Pappas continued.
Education of the workforce is important to this goal, How said. “A focused effort is needed to educate the next generation of researchers and engineers on how to build these machine learning-enabled safety-critical systems.”
This effort should extend beyond graduation, he added. “Industry, recognizing that this is an important challenge, could further educate the engineers they already have in this field to become further aware of … safety regulations and how to embed those into the thinking processes for developing these machine learning capabilities.”
Related Resources
Read the report and interactive overview