Machine Learning for Safety-Critical Applications: Opportunities, Challenges, and a Research Agenda (2025)

Suggested Citation: "5 Societal Considerations to Build Public Understanding and Confidence in Safety-Critical Systems with Machine Learning Components." National Academies of Sciences, Engineering, and Medicine. 2025. Machine Learning for Safety-Critical Applications: Opportunities, Challenges, and a Research Agenda. Washington, DC: The National Academies Press. doi: 10.17226/27970.

5

Societal Considerations to Build Public Understanding and Confidence in Safety-Critical Systems with Machine Learning Components

This report does not advocate for the public to broadly trust safety-critical systems with machine learning (ML) components. Instead, the committee believes that building public trust means keeping the public informed of the risks and benefits of ML technologies. Standards and regulations also play an important role in supporting safety and building user confidence. The previous chapters explored risks and benefits; this chapter explores the challenges and complexity of standards and regulations for ML-enabled systems.

5.1 DEVELOPING STANDARDS AND REGULATIONS FOR MACHINE LEARNING

As referenced in Chapter 2, the application of ML has a wide range of potential benefits that can positively influence numerous aspects of society and the economy. ML is not simply a new tool or technology. Its advantages, some realized and some anticipated, have the potential to radically transform industries and reshape human interaction with machines in everyday life. However, concerns such as lack of transparency, security vulnerabilities, bias, and even existential risk continue to draw as much attention to conceivable harms as to the anticipated benefits. ML brings a level of uncertainty requiring new approaches and considerations.

The use of ML in a safety-critical system will initially leverage existing risk management techniques, such as the National Institute of Standards and Technology’s Artificial Intelligence Risk Management Framework,1 that are normally applied to safety systems and software applications. However, such techniques may not be fully sufficient to address the new risks that arise when ML is used as a safety system component. Achieving the necessary level of confidence in new ML techniques and applications will require new or additional confidence mechanisms for their safe use.

Governments often establish regulations to protect security, health and safety, property, or the environment and to promote public trust. The anticipated applications of ML cross into many of these regulated interests and continue to drive new and broad policy and regulatory debate. The situation can be more complex when calls for new “horizontal” ML regulation (i.e., regulations that apply to the use of ML across application domains) intersect with long-established regulatory and compliance methods intended to ensure safety in specific domains such as industrial robotics, medicine, automotive, or aerospace.

The interplay of standards and regulations with rapidly evolving technologies is complex. Standards and regulations, although often interconnected, fulfill distinct roles in the governance of technology and safety. A critical balance must be struck between

  • Legally binding regulations that encompass essential requirements and recognized confidence mechanisms and
  • Consensus-driven, voluntary standards that codify specific requirements, testing methodologies, and industry best practices.

While standards generally offer more flexibility and faster updates compared to regulations, neither process is particularly rapid or efficient—especially when international harmonization or alignment along common timelines is needed. To mitigate the challenges posed by innovation-driven regulatory and harmonization lag, more agile and responsive mechanisms—such as guidance documents, frameworks for essential requirements, or sandbox and agile development methodologies—can provide interim solutions while more robust, harmonized approaches are being developed.

Finding 5-1: Although guidance for governing ML in safety-critical systems has progressed, new standards, regulations, and testing methods are needed to address both cross-cutting and domain-specific safety challenges.

___________________

1 National Institute of Standards and Technology, 2023, “Artificial Intelligence Risk Management Framework (AI RMF 1.0),” https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf.

Safety-Critical System Regulation and Standards

Many applications that rely on safety-critical systems are currently subject to some form of government regulatory oversight and liability regime. As a result, each sector has established risk management standards, regulatory models, and defined confidence mechanisms (e.g., required minimum oversight, compliance, or certification obligations) that are intended both to build and maintain trust in the regulatory system and to ensure that its practical application delivers predictable safety outcomes.

In the United States, independent safety regulators often have legislatively assigned obligations for safety regulation within a domain or sector of responsibility. Some examples include the following:

  • The Occupational Safety and Health Administration (OSHA) is responsible for developing regulations on the health, safety, and well-being of workers and sets regulatory obligations under 29 CFR Part 1910, which includes specific standards, guidelines, and requirements that address safety-critical systems for robotics in the workplace.2
  • The National Highway Traffic Safety Administration (NHTSA) sets standards for motor vehicle safety through a series of Federal Motor Vehicle Safety Standards (FMVSS) under 49 CFR Parts 500–599.3 These requirements address specific safety-critical systems needed for the safe operation and use of motor vehicles.
  • The U.S. Food and Drug Administration (FDA) has regulatory responsibility for medical device safety, described in 21 CFR Parts 800–899, which includes specific minimum standards for design, operation, and system safety.4

In the European Union (EU), a similar sector-specific regulatory structure is promulgated through direction to member states. EU regulations and directives set “essential requirements” for each sector that are subsequently adopted by each member state’s national legislative and regulatory systems. For example,

  • Workplace safety of robotics and machinery is addressed by the Machinery Regulation (EU) 2023/1230.5
  • Medical device safety is included in the Medical Device Regulation (EU) 2017/745.6

___________________

2 Code of Federal Regulations, 1975, “Part 1910—Occupational Safety and Health Standards (OSHA),” Title 29 CFR Part 1910, https://www.ecfr.gov/current/title-29/subtitle-B/chapter-XVII/part-1910.

3 Code of Federal Regulations, “Title 49 CFR Parts 500–599,” National Highway Traffic Safety Administration, Department of Transportation, https://www.ecfr.gov/current/title-49/subtitle-B/chapter-V.

4 Title 21 CFR Subchapter H, Parts 800–899, Medical Devices—General, https://www.ecfr.gov/current/title-21/chapter-I/subchapter-H/part-800.

Each sector-specific (or more general) safety regulation also carries a European Commission mandate for harmonized European standards development, intended to ensure common and effective application of requirements community-wide.7 This legislative and standards framework is built on the General Product Safety Regulation, which was recently updated to address artificial intelligence (AI) use across many industries, and the proposed AI Liability Directive.8,9

Like those of most countries, the U.S. and EU regulatory models for safety-critical systems take sector-specific approaches that set essential safety requirements and include national or regional standards to establish accepted risk management approaches addressing safety system design, deployment, and operation. These standards and guidelines can be normative, voluntarily applied in developing application safety cases, or simply best practices. Each regulator has its own safety or risk management process codified in these standards. However, because the risks to persons or property being addressed are globally relevant, these risk management and safety standards are frequently based on international standards, which can include specific national requirements where necessary. For example, international risk and hazard management standards such as ISO 26262 (Road Vehicles–Functional Safety),10 ISO 14971 (Medical Devices–Application of Risk Management to Medical Devices), IEC 62304 (Medical Device Software), IEC 62061 (Safety of Machinery–Functional Safety of Programmable Devices), and IEC 60730 (Automatic Electrical Controls) often form the basis of the national or regional compliance standards that underpin many countries’ regulations for domains that include safety-critical systems.

___________________

5 European Union (EU), 2023, “Regulation (EU) 2023/1230 of the European Parliament and of the Council of 14 June 2023 on Machinery,” Official Journal of the European Union L 165/1, https://eur-lex.europa.eu/eli/reg/2023/1230/oj.

6 EU, 2017, “Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on Medical Devices,” Official Journal of the European Union L 117/1, https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=uriserv:OJ.L_.2017.117.01.0001.01.ENG.

7 European Commission, 2022, “Draft Standardization Request to the European Standardization Organizations in Support of Safe and Trustworthy Artificial Intelligence,” DG Grow.H.3, Art 12 on Regulations N. 1025/2012, https://ec.europa.eu/docsroom/documents/52376.

8 EU, 2023, “Regulation (EU) 2023/988 of the European Parliament and of the Council of 10 May 2023 on General Product Safety,” Official Journal of the European Union L 135/1, https://eur-lex.europa.eu/eli/reg/2023/988/oj.

9 EU, 2022, “Proposal for a Directive of the European Parliament and of the Council on Adapting Non-Contractual Civil Liability Rules to Artificial Intelligence (AI Liability Directive),” Comm 2022 496 European Commission 2022/0303, https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex:52022PC0496.

10 International Organization for Standardization, 2018, “ISO 26262-1: Road Vehicles – Functional Safety,” 2nd edition, https://www.iso.org/standard/68383.html.

Within diverse regulatory systems and risk management approaches, the point at which regulators choose to engage the developer, integrator, or user of a safety system can also vary by sector. The confidence mechanisms, certification, and reporting requirements are unique to the sector. Illustrating this point, the Federal Aviation Administration (FAA) process for aircraft authorizations generally requires involvement with the regulatory authority early in the development of a safety system. In contrast, for automotive applications, NHTSA is more often engaged near the end of the development process. Confidence mechanisms, use of first- or third-party assessments, software update controls, and compliance processes are as diverse as the standards and requirements they support.

Safety-critical systems, by their nature, also require an additional level of rigor to address system design changes or software updates needed to maintain the effective performance of the safety system throughout its deployment lifetime. Most domain risk management standards and regulations mandate change management and version control that obligate the system manufacturer or operator to ensure ongoing compliance management, testing, and re-certification of the system when changes are made. The maintenance of ML components and their associated training models will necessitate an expansion of the existing requirements for post-deployment or field updates of critical software and hardware to consider new risks related to ML. These might include requirements and processes to address ongoing model updates and expected changes to the operational environment or safety specification over time.
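To make this concrete, the sketch below shows one way such a change-management gate might be encoded. It is a minimal illustration under stated assumptions, not a mechanism prescribed by any cited standard; the record fields, names, and pass criterion are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MLComponentRelease:
    """Illustrative record of a fielded ML component version."""
    model_version: str
    training_data_hash: str   # fingerprint of the training set
    safety_spec_version: str  # safety requirements the release was certified against
    validated: bool           # True once re-validation evidence is on file

def requires_recertification(deployed: MLComponentRelease,
                             candidate: MLComponentRelease) -> bool:
    """A field update triggers re-certification if the model, its training
    data, or the applicable safety specification has changed."""
    return (candidate.model_version != deployed.model_version
            or candidate.training_data_hash != deployed.training_data_hash
            or candidate.safety_spec_version != deployed.safety_spec_version)

def approve_field_update(deployed: MLComponentRelease,
                         candidate: MLComponentRelease) -> bool:
    """Block deployment of any change that lacks re-validation evidence."""
    if requires_recertification(deployed, candidate):
        return candidate.validated
    return True  # identical release: no new evidence needed

# Example: a retrained model without fresh validation evidence is rejected.
current = MLComponentRelease("1.0", "sha256:abc", "spec-2024-01", validated=True)
retrained = MLComponentRelease("1.1", "sha256:def", "spec-2024-01", validated=False)
assert not approve_field_update(current, retrained)
```

The design point is simply that any change to the model, its data, or its safety specification is treated as a new configuration that must re-enter the compliance process before fielding.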

Finding 5-2: Governments have made meaningful progress developing guidance for AI/ML. However, these measures alone cannot adequately address the stringent requirements of safety-critical systems.

Horizontal Artificial Intelligence Regulatory Approaches

Both the European Union and the United States have considered AI regulations that are horizontal (i.e., cross-cutting across all application sectors). These frameworks will have impacts well beyond safety-critical systems, directly affecting the national standards and safety regulations in use today.

Regulation of AI, and more specifically of AI safety, in the Organisation for Economic Co-operation and Development and G20 economies is highly dynamic as all stakeholders grapple with the potential risks of AI very broadly. A 2023 report on AI regulatory landscapes11 highlighted that a risk-based approach to AI deployment is a priority within the G7 economies.

___________________

11 N.M. Bianzino, M.-L. Delarue, S. Maher, A. Koene, K. Kummer, and F. Hassan-Szlamka, 2023, “The Artificial Intelligence (AI) Global Regulatory Landscape,” Ernst & Young Global Limited, https://www.ey.com/content/dam/ey-unified-site/ey-com/en-gl/insights/ai/documents/ey-the-artificial-intelligence-ai-global-regulatory-landscape-final.pdf.

The report goes on to characterize the necessary balance of regulation and guidance that frames an economic approach to dealing with AI risks, whereby AI applications considered high-risk face more onerous compliance expectations than those posing little risk. Applications considered sufficiently high risk will be subject to these horizontal regulatory (versus guidance) approaches in similar but uniquely national ways.

On the other hand, many AI stakeholders believe that, instead of horizontal frameworks, regulations should target only extreme risks.12 The Biden administration moved toward horizontal regulation in an executive order that was rescinded by the Trump administration in 2025.

Europe’s more prescriptive, product-centered approach includes both a horizontal EU AI Act, first introduced in 2021, and a draft AI Liability Directive, introduced in 2022. The EU AI Act is a risk-based, top-down, product- or application-focused regulation modeled on and aligned with the community’s existing safety regulations.13 The AI Act establishes high-level essential requirements along with formal risk determination and conformity assessment frameworks that are broadly consistent with present compliance frameworks. A coordinated revision of sector-specific safety regulations and technical standards is expected over the next few years.14,15,16

The speed with which innovative ML solutions are introduced can lead to deployed systems where trust in the performance of the system is presumed until there are demonstrated failures. In lower-risk sectors, demonstration systems or testimonials are commonly considered suitable confidence mechanisms for establishing trust, whether in prescribed regulatory measures or in the evidence required to comply. For example, ML applications are now common in medical imaging and diagnosis, where they provide advice to a human medical practitioner who remains responsible for the patient’s safety. However, this approach is generally inconsistent with many safety-critical system approaches, where compliance should be assured without human intervention and demonstrated prior to the safety system’s deployment.

Jurisdictions such as Singapore and the European Union are beginning to use AI development and regulatory sandboxes that can address both innovators’ desire for rapid AI development and the lack of developer and user experience with new AI technologies.17

___________________

12 B. Park, 2023, “The World Wants to Regulate AI, But Does Not Quite Know How,” The Economist, https://www.economist.com/business/2023/10/24/the-world-wants-to-regulate-ai-but-does-not-quite-know-how.

13 A. Satariano, 2023, “E.U. Agrees on Landmark Artificial Intelligence Rules,” New York Times, December 8, https://www.nytimes.com/2023/12/08/technology/eu-ai-act-regulation.html.

14 EU, “European Approach to Artificial Intelligence,” https://digital-strategy.ec.europa.eu/en/policies/european-approach-artificial-intelligence.

15 European Commission, 2021, “Proposal for a Regulation of the European Parliament and of the Council Laying Down Harmonised Rules on Artificial Intelligence (ARTIFICIAL INTELLIGENCE ACT) and Amending Certain Union Legislative Acts,” Com(2021 206) 2021.0106, https://eur-lex.europa.eu/resource.html?uri=cellar:e0649735-a372-11eb-9585-01aa75ed71a1.0001.02/DOC_1&format=PDF.

16 EU, 2022, “Proposal for a Directive of the European Parliament and of the Council on Adapting Non-Contractual Civil Liability Rules to Artificial Intelligence (AI Liability Directive),” Comm 2022 496 European Commission 2022/0303, https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex:52022PC0496.

These “sandboxes,” which are common in software application development, can enable the testing of AI solutions in a supervised environment with a view toward developing more effective regulation or guidance before broader public deployment.

Consideration of new requirements, standards, and risk mitigation approaches has already started in safety-critical domains that are likely to leverage ML components to achieve good safety outcomes. The EU AI Act directly charges sector-specific regulators and regulations with addressing these new risks. These domains already have tools and compliance management processes for the use of software in safety systems that are considered adequate for some types of AI/ML but may not be adequate for generative AI components (see Chapter 3).

Finding 5-3: Harmonizing cross-cutting (horizontal) AI/ML standards with domain-specific (vertical) safety-critical system standards is essential for integrating ML components into safety systems effectively.

Furthermore, recognizing that there are many possible application domains and a multitude of regulating bodies, the committee believes that a change in mindset is required to keep the public safe: a transition away from a model that investigates safety after accidents happen and toward one in which manufacturers must demonstrate that their system is sufficiently safe to operate before it is allowed into broad use. Similarly, the equivalent of the institutional review boards (IRBs) that U.S. government agencies use to provide ethical and regulatory oversight of research involving human subjects should be applied more broadly.18 Experiment plans must typically be submitted to and reviewed by an IRB before the experiment begins and are frequently re-reviewed as the experimental process evolves. IRB policies also require that researchers involved in any study with human subjects certify completion of an education program in the protection of human subjects. The intent is to apply a similar degree of pre-planning and oversight to groups that want to deploy ML in safety-critical systems, ensuring that the appropriate level of risk has been analyzed and that appropriate safety measures are in place.

___________________

17 N.M. Bianzino, M.-L. Delarue, S. Maher, A. Koene, K. Kummer, and F. Hassan-Szlamka, 2023, “The Artificial Intelligence (AI) Global Regulatory Landscape,” Ernst & Young Global Limited, https://www.ey.com/content/dam/ey-unified-site/ey-com/en-gl/insights/ai/documents/ey-the-artificial-intelligence-ai-global-regulatory-landscape-final.pdf.

18 U.S. Food and Drug Administration, 2019, “Institutional Review Boards (IRBs) and Protection of Human Subjects in Clinical Trials,” https://www.fda.gov/about-fda/center-drug-evaluation-and-research-cder/institutional-review-boards-irbs-and-protection-human-subjects-clinical-trials.

Standards Development

Given the public, industry, and policy interest in AI safety, AI standards and standardization projects continue to proliferate in consortia, industry groups, and, more formally, in governments and international standardization bodies. As described above and in Chapter 3, unique domain-specific regulations will leverage domain-specific, vertical standards and guidelines to establish the necessary risk management, minimum requirements, and compliance assessment for each sector. Dozens of unique national, regional, and international standards for software and system safety are already employed in these domains, and a few have begun to consider the safety and reliability of AI components in safety systems. Some of the most common safety standards developing safeguards for AI/ML applications include the following:

  • IEC 61508: Functional Safety of Electrical, Electronic, and Programmable Electronic Safety-Related Systems
  • ISO 26262: Road Vehicles—Functional Safety
  • ISO 21448: Road Vehicles—Safety of the Intended Functionality
  • DO-178C: Software Considerations in Airborne Systems and Equipment Certification
  • EN 50128: Railway Applications—Communication, Signaling and Processing Systems
  • UL 4600: Evaluation of Autonomous Products

Beyond the important work already under way in the road vehicle domain, the more impactful standards development work to address ML is likely to be directly associated with new horizontal AI/ML regulation that will influence domain-specific safety standards in the future.

The evolving national regulatory and guidance frameworks for AI prioritize safety and compliance, providing direction for future standards development. One key driver, both internationally and in the United States, is the Department of Commerce (NIST) request for information (RFI), issued in response to an executive order, seeking public input to begin work that advances the evaluation, measurement, and trustworthiness of AI technologies.19 NIST has been given responsibility for developing guidelines, standards, and best practices for AI safety and security, aiming for globally relevant consensus standards. As part of this public initiative, NIST has formed a new public consortium that will work with NIST’s AI Safety Institute, with the expectation of developing innovative methods for evaluating AI systems and building confidence in AI safety and trustworthiness.20

___________________

19 NIST, 2023, “Docket Number: 231218-0309, RIN:0693-XC135: Request for Information (RFI) Related to NIST’s Assignments Under Sections 4.1, 4.5, and 11 of the Executive Order Concerning Artificial Intelligence (Sections 4.1, 4.5, and 11),” https://www.federalregister.gov/documents/2023/12/21/2023-28232/request-for-information-rfi-related-to-nists-assignments-under-sections-41-45-and-11-of-the.

The EU AI Act, like most community technical regulation, is supported by Commission standardization mandates to European standards development organizations (SDOs), including CEN and CENELEC. These organizations are already working under the draft AI standardization mandate to establish harmonized European standards for AI systems, ensuring safety and fostering global cooperation. The joint European technical committee CEN/CENELEC JTC21 is responsible for AI standards addressing aspects such as risk management, post-market monitoring, and conformity assessment, expected to support the European Union’s new AI regulatory obligations.

In most regions, national SDOs are proactively engaged in standards development in international bodies such as ISO/IEC JTC1 SC42. The ongoing work in SC42 includes more than 50 proposals across the AI spectrum, including new standards for AI risk and management systems, trustworthiness, robustness, bias, data integrity, and, more recently, AI conformity assessment systems. This work includes two standards projects that are expected to broadly and directly impact the use of ML in safety-critical systems:

  • ISO/IEC TR 5469 (Artificial Intelligence—Functional Safety and AI Systems) was published in early 2024.21 This technical report identifies the most important attributes and challenges of using AI for functional safety as a component in safety-critical systems.
  • ISO/IEC TS 22440 (draft), Functional Safety and AI Systems—Requirements, is a project in a new joint working group (SC42 JWG4) formed in late 2023. This working group is expected to develop a new horizontal technical specification that establishes essential requirements and validation methods for the use of AI in functional safety applications. This document will be foundational for risk management and future safety standards in many safety-critical sectors, including programmable electronic controls (IEC 61508), road vehicles (ISO 26262), industrial machinery (IEC 62061), and industrial process controls (IEC 61511).

Finding 5-4: Consensus standards efforts such as ISO/IEC JTC1 SC42, NIST’s Artificial Intelligence Safety Institute, and the regional standards work in CEN/CENELEC JTC21 are important steps toward engineering safety-critical systems with ML components.

___________________

20 NIST, “Artificial Intelligence Safety Institute Consortium,” https://www.nist.gov/artificial-intelligence/artificial-intelligence-safety-institute.

21 ISO/IEC, 2024, “Artificial Intelligence, Functional Safety and AI Systems,” ISO/IEC TR 5469:2024, https://www.iso.org/standard/81283.html.

While the traditional model of comprehensive, harmonized, sector-specific regulations remains important for safety-critical systems, the immediate future will likely see a more diverse and agile regulatory and standards ecosystem. This approach aims to balance the need for innovation with the imperative of public safety in a rapidly changing technological landscape.

5.2 SAFE DEPLOYMENT

Benchmarks, Testbeds, and Red-Teaming

The committee believes that there is a need for more widespread use of formalized testing processes. Challenge problems and benchmark data sets are frequently used in academia to compare algorithm performance in published papers, and similar frameworks should be created to formalize the analysis of deployed systems. For self-driving cars, for example, this could take the form of the equivalent of a driver’s test, with a set of skills that must be demonstrated to a qualified observer in a live setting under various lighting and weather conditions. This intuitive analogy has a counterpart in the test suites defined by the United Nations committee for international harmonization of type approval requirements for advanced driver assistance systems—for example, UNECE R157 on testing requirements for automated lane keeping systems (ALKS). Such regulations formalize the intuition of a “driving test” through standardized sets of scenarios that an ALKS-equipped vehicle must pass to obtain type approval. Because many of the stressing (“corner”) cases may require specific configurations and interactions, specialized test sites should be created to perform these demonstrations. For example, test sites such as Mcity could be utilized for autonomous vehicles (AVs), and the unmanned aerial system (UAS) test site facilities could be utilized to test autonomous air mobility concepts.22,23 However, given the extreme complexity of variations in environmental conditions that must be covered, it is impossible to achieve a sufficient degree of coverage with test fields alone. These physical test facilities must be complemented by virtual test environments, requiring the capability to build highly faithful digital twins of both the system under test (such as a self-driving car) and its environment. Test fields are then particularly relevant for establishing that models of components of the system under test faithfully represent the actual component and its interaction with the environment—for example, that a model of a radar system faithfully reproduces artifacts such as ghost objects resulting from reflections. For ML-based systems, such digital

___________________

22 University of Michigan, “Mcity,” https://mcity.umich.edu, accessed October 4, 2025.

23 Federal Aviation Administration, 2025, “UAS Test Site Locations,” January 24, https://www.faa.gov/uas/programs_partnerships/test_sites/locations.

twins are a prerequisite—for example, for testing the robustness of ML-based components under perturbations such as changing weather conditions, distribution shifts, or rare events. Similar test scenarios, in hardware or simulation, could be created in other domains to analyze ML safety and performance.
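As a concrete illustration of how a virtual test environment might exercise an ML component across such perturbations, consider the following minimal Python sketch. The perturbation list, pass criterion, and toy perception component are hypothetical stand-ins for what a real digital twin and domain standard would define.

```python
import random
from typing import Callable, Dict, List

# Hypothetical perturbations a digital twin might apply to nominal scenarios.
PERTURBATIONS = ["nominal", "heavy_rain", "low_sun_glare", "sensor_dropout", "fog"]

def sweep_conditions(ml_component: Callable[[str, str], bool],
                     scenarios: List[str],
                     trials_per_condition: int = 100) -> Dict[str, float]:
    """Estimate per-condition pass rates for an ML component exercised in a
    simulated environment; returns condition -> fraction of trials passed."""
    results: Dict[str, float] = {}
    for condition in PERTURBATIONS:
        passed = 0
        for _ in range(trials_per_condition):
            scenario = random.choice(scenarios)
            # In a real harness this call would run the digital twin;
            # here the toy component reports pass/fail directly.
            passed += ml_component(scenario, condition)
        results[condition] = passed / trials_per_condition
    return results

# Toy stand-in for an ML-based perception stack that degrades in bad weather.
def toy_perception(scenario: str, condition: str) -> bool:
    failure_rate = {"nominal": 0.01, "heavy_rain": 0.08, "low_sun_glare": 0.05,
                    "sensor_dropout": 0.12, "fog": 0.10}[condition]
    return random.random() > failure_rate

rates = sweep_conditions(toy_perception, scenarios=["lane_keep", "cut_in", "merge"])
for condition, rate in sorted(rates.items()):
    print(f"{condition:15s} pass rate: {rate:.2%}")
```

The value of the simulated sweep is precisely that conditions too rare or too dangerous to stage on a physical test field can still accumulate statistically meaningful trial counts.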

Oversight by a publicly funded, independent entity, with regularly published and updated performance and safety analyses, would provide the public with critical information (similar to regularly published crash test data). This requires regulation obligating safety-critical systems based on ML technology to regularly report to such an independent entity, covering the specific application domain, any incidents (crashes and near misses) caused by misinterpretation of the deployed system’s actual environment, akin to processes well established in the civil avionics domain by the FAA. A technical oversight committee of experts in the field could regularly update, randomize, and modify the test scenarios to ensure that they are representative of recently detected failures and that ML systems are not over-designed to succeed on specific cases. The intent here is to analyze the safety of systems beyond the minimum published standards. The published results would then help bridge the gap between advancing research analysis and increasing public trust in the safety of ML.
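As one illustration of how such an oversight committee might randomize and refresh its scenario suite in response to field incidents, consider the sketch below. The weighting scheme and scenario names are purely illustrative assumptions, not a proposed official procedure.

```python
import random
from collections import Counter
from typing import List

def draw_test_suite(base_scenarios: List[str],
                    incident_scenarios: List[str],
                    suite_size: int,
                    seed: int) -> List[str]:
    """Sample a test suite that over-weights scenario types implicated in
    recent field incidents, so the suite tracks observed failure modes."""
    rng = random.Random(seed)  # a published seed makes the draw reproducible
    incident_counts = Counter(incident_scenarios)
    # Weight each scenario by 1 plus the number of related incident reports.
    weights = [1 + incident_counts.get(s, 0) for s in base_scenarios]
    return rng.choices(base_scenarios, weights=weights, k=suite_size)

scenarios = ["pedestrian_crossing", "construction_zone", "emergency_vehicle",
             "unprotected_left_turn", "debris_on_road"]
recent_incidents = ["construction_zone", "construction_zone", "debris_on_road"]
suite = draw_test_suite(scenarios, recent_incidents, suite_size=10, seed=2025)
print(suite)  # construction-zone cases appear more often in this draw
```

Randomizing the draw while biasing it toward recent failure modes addresses both concerns in the text: manufacturers cannot over-fit to a fixed published suite, and the suite stays representative of what is actually failing in the field.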

Finding 5-5: Implementing processes for learning from in-field incidents is essential for maintaining trust in ML-based safety-critical systems. These processes include analysis of causes for such incidents by an independent body, and processes enforcing upgrades of deployed systems to eliminate the causes of such incidents.

Finding 5-6: Faithful testing capabilities are needed to ensure the safe deployment of ML technologies in safety-critical applications. This will involve the development of representative testing grounds and highly faithful digital twins of such systems and their environment.

Given the reliance of ML on data, it is important that the public be made aware of the data sources used to train the ML systems that have been deployed. This suggests a new paradigm (beyond open source) of increased openness and sharing of the data sets, software, and models used to create these ML systems, with an emphasis on corporate transparency.
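A minimal sketch of what such a public training-data disclosure might look like is shown below, loosely inspired by datasheet and model-card practice; every field name here is an assumption rather than an established reporting format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class TrainingDataDisclosure:
    """Illustrative public manifest for the data behind a deployed ML system."""
    system_name: str
    model_version: str
    data_sources: List[str]       # provenance of the training corpora
    collection_period: str
    known_gaps: List[str] = field(default_factory=list)  # underrepresented conditions

disclosure = TrainingDataDisclosure(
    system_name="example-lane-keeping-assist",  # hypothetical system
    model_version="2.3",
    data_sources=["fleet camera logs (opt-in)", "licensed map imagery",
                  "synthetic night-driving scenes"],
    collection_period="2022-01 to 2024-06",
    known_gaps=["heavy snow", "unpaved roads"],
)
print(json.dumps(asdict(disclosure), indent=2))
```

Even a disclosure this coarse lets the public and independent reviewers judge whether the deployment environment plausibly matches the training data, which is the core transparency concern raised above.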

In addition to formalized testing prior to system deployment, it is critically important that deployed systems with integrated ML components be continuously assessed for safety conformity after deployment. It is well known that the performance and behavior of ML models can change or degrade based on changes in environmental, contextual, or other factors once deployed in the real world.24

Standards for post-deployment monitoring and mitigation of changes in system performance will establish best practices and mechanisms to ensure that systems continue to operate within their established safety operating parameters. This should be realized through engineered monitoring mechanisms integrated into the systems, combined with periodic reevaluation of systems by auditors.
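A minimal sketch of one such engineered monitoring mechanism appears below: a runtime check that flags inputs drifting outside the envelope established during validation. The z-score statistic and threshold are illustrative assumptions; real deployments would use richer out-of-distribution detectors.

```python
from statistics import fmean, stdev
from typing import List

class OperatingEnvelopeMonitor:
    """Flags when a stream of input statistics drifts away from the
    distribution the system was validated against (a simple z-score check)."""

    def __init__(self, reference: List[float], z_threshold: float = 3.0):
        self.mean = fmean(reference)
        self.std = stdev(reference)
        self.z_threshold = z_threshold

    def in_envelope(self, observation: float) -> bool:
        z = abs(observation - self.mean) / self.std
        return z <= self.z_threshold

# Reference statistics captured during validation (illustrative values).
monitor = OperatingEnvelopeMonitor(reference=[0.48, 0.51, 0.49, 0.52, 0.50])
assert monitor.in_envelope(0.50)      # nominal input
assert not monitor.in_envelope(0.90)  # drift: trigger fallback and audit
```

An out-of-envelope flag would not by itself prove unsafe behavior; it would trigger the fallback behaviors and auditor reevaluation that the standards described above are meant to mandate.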

5.3 EDUCATION

Bridging the gap between the ML and safety-critical systems communities is crucial for the secure and reliable integration of ML in safety-critical domains. The differences discussed earlier, spanning scientific practice, community norms, regulation, and culture, highlight the complexity of this challenge.

Addressing scientific disparities involves establishing common ground and understanding between ML experts and safety-critical systems specialists through collaboration and interdisciplinary research. Mutual understanding and the development of shared best practices can be promoted by organizing joint conferences and creating forums for cross-community discussion to address varying community norms. There is a need to create bridge conferences, meetings, and workshops between ML researchers, who typically attend the Conference on Neural Information Processing Systems, the International Conference on Machine Learning, and the International Conference on Learning Representations, and the more domain-specific conferences across various sectors. Recently, events such as the Conference on Robot Learning, Learning for Dynamics and Control, and the Conference on Secure and Trustworthy Machine Learning, among others, have provided useful intellectual spaces for researchers to understand the different languages and perspectives of these fields.

Harmonizing ML and safety regulations is essential, considering the stringent regulatory standards in safety-critical systems. This involves ensuring that safety standards are met while allowing for the innovative integration of ML technologies. Cultural distinctions, whether in terms of organizational culture or broader societal values, can impact the adoption and acceptance of ML in safety-critical systems. Efforts to build cultural bridges, foster dialogue, and promote education can help align perspectives and facilitate smoother integration.

Educational initiatives play a pivotal role in retraining the current generation and preparing the next to bridge this gap while ensuring the safety of ML-enabled systems.

___________________

24 D. Vela, A. Sharp, R. Zhang, T. Nguyen, A. Hoang, and O.S. Pianykh, 2022, “Temporal Quality Degradation in AI Models,” Scientific Reports 12(11654).

Practitioners can contribute through mentorship programs and industry collaborations. Introducing new courses and educational programs covering the intersection of ML and safety-critical systems equips students with interdisciplinary skills for effective collaboration. This educational approach ensures that future professionals understand both the technical intricacies of ML and the safety considerations crucial for applications in critical domains. In safety-critical settings, new courses on safe autonomy and trustworthy AI are emerging across the country, but a broader effort is needed.

The committee believes that there is a need for significant emphasis on AI/ML education. This includes public education on the benefits, risks, and possible limitations of ML to avoid overconfidence in ML capabilities. It also includes teaching the importance of systems and safety engineering in computer science departments and AI/ML programs, with further education in the workforce to reduce the culture gap over time.

The intent is to ensure that all researchers, developers, and engineers deploying ML learn about the importance of safety standards and methodologies in practice. This also includes enhancing the competence of the government officials and regulators who will be determining and enforcing these rules, making them “smart customers.”25

A holistic strategy that combines collaborative research, community engagement, regulatory alignment, cultural understanding, and robust educational initiatives is essential for creating a workforce capable of safely integrating ML in safety-critical systems. This multifaceted approach addresses current disparities and lays the foundation for a sustainable and secure future in ML-enabled technologies.

Finding 5-7: To educate the next generation of researchers and engineers on how to build ML-enabled safety-critical systems, graduate-level courses and curricula are needed that emphasize a holistic systems perspective, building on and integrating competencies on ML design, information technology design, systems safety and security, and human–machine cooperation, among others.

___________________

25 P.E. Ross, 2023, “A Former Pilot on Why Autonomous Vehicles Are So Risky: 5 Questions for Transportation-Safety Expert Missy Cummings,” IEEE Spectrum, May 13, https://spectrum.ieee.org/transformer-crisis.
