Suggested Citation: "1 Introduction." National Academies of Sciences, Engineering, and Medicine. 2025. Leveraging Artificial Intelligence and Big Data to Enhance Safety Analysis: A Guide. Washington, DC: The National Academies Press. doi: 10.17226/29098.

CHAPTER 1
Introduction

Intended Audience

This user guide is for transportation safety professionals who are interested in applying new and innovative datasets and analysis methodologies in safety investigations. Researchers, data vendors, and technology vendors may also find this user guide helpful.

Importance of Transportation Safety

The need to improve road safety performance for all users is clear, and the optimization of investment by local and state agencies to maximize lives saved and injuries reduced takes on even greater importance when financial resources are constrained. For example, unlocking the broader sustainable benefits that come from active transportation modes also requires an understanding of infrastructure safety performance. The lack of infrastructure-based sensors, low-cost data, safety performance metrics, and prioritized investment options makes it difficult for agencies to understand the business case for safer roads and to measure progress.

This research has investigated the use of artificial intelligence (AI), machine learning (ML), and big data to provide the information needed to power key data-driven, public, and proprietary safety analysis tools as well as predictive and other systemic safety tools. The availability of large-scale, high-resolution, and consistently collected data across the entire road network will improve the visibility of existing network conditions, with a focus on the road and exposure features that influence the safety of all road users. This consistent data, collected with a cost-benefit ratio in mind, can then inform and accelerate the prioritization of investments needed to support safe system outcomes, ensuring an efficient and optimal interaction among all road users.

Project Purpose and Overview

The objective of this research was to develop a guide to advance the use of AI and ML in analyzing both conventional and unconventional data and assess their effectiveness in supporting safe system and modal priority decision-making as well as performance tracking. The resultant algorithms are expected to improve and optimize analyses using existing data and data-driven safety analysis tools developed based on conventional statistical modeling.

The purpose of the research is also to (1) pinpoint potential data sources, (2) identify or develop the requisite data preparation and extraction algorithms for use in safety analysis, and (3) document each sourceʼs coverage, frequency of collection, granularity, accessibility to practitioners, and cost. This data creates the potential for low-cost and more frequent generation of, among other things: key fatality and injury prediction risk maps; road feature mapping; star ratings and other safety analyses for pedestrians, cyclists, motorcyclists, micro-mobility services, and vehicle occupants; identification of data for safety analyses and associated tools; and the development of safety plans that can be used for funding submissions and in prioritizing investments across the local and state road networks.

Finally, this research develops guidelines for managing data using a format that can be accessed by various tools. These guidelines have been tested through pilot projects to allow for appropriate adjustment and greater understanding. Developing guidelines enhances implementation and provides necessary information on the use of this data in safety systems and in determining modal priority needs. The results of this research could be included in national-level resources that support data-driven safety analysis.

This user guide facilitates the use of the proposed framework and associated AI and ML algorithms and models to undertake data-driven safety analyses generated by safety analysis tools. It includes case studies conducted during this projectʼs research phase to identify proposed data sources and methodologies and then test the efficacy of those methods to generate a benefit to safety analysis. Finally, the guide presents details on how similar data collection, analysis, and interpretation of results can be used by practitioners to improve safety analysis.

Definitions and Descriptions

The following are the definitions for AI, ML, and big data, as used in this research effort.

AI is a branch of computer science aimed at mimicking the problem-solving and decision-making capabilities of the human mind. The advancement of processing power, in conjunction with the development of numerous, ever more efficient AI algorithms and the growth in information generation related to all facets of life, has made AI ubiquitous in todayʼs world. Current data has shifted from letters and digits to encompass other forms such as audio, images, and videos. AI techniques excel at looking at large quantities of data of all forms, recognizing patterns from those datasets, and applying them to new scenarios. Examples include image recognition for unlocking a phone, personalized recommendations on any website, spam filters on emails, virtual assistants, and automatic translation.

In transportation engineering, various implementations of AI can already be seen, such as autonomous driving, real-time structural health monitoring systems, traffic forecasting, highway capacity analysis, and level of service determination. With changing transportation dimensions, such as the emergence of transportation as a service, improvements in autonomous vehicles, the degradation of infrastructure, and a need for efficient condition monitoring systems, the need for better prediction of future conditions and optimization of resource management is apparent. AI can help us leverage available information more efficiently and provide us with accurate information that can be used for decision-making. AI can also learn and adapt over time and provide more accurate results with additional data. The capabilities and applicability of AI are still being explored, and we can expect them to improve further and expand in the future.

ML involves using statistical learning and optimization methods that let computers analyze datasets and identify patterns. ML techniques leverage data mining to identify historical trends and inform future models. Components include a decision process (the calculations or steps used to identify patterns), an error function (which quantifies how far a guess misses the observed value), and an optimization process that reduces potential errors in the next guess by considering past decisions. Updating these steps autonomously leads to improved analytical accuracy with each run (UC Berkeley 2020).
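The three components named above can be illustrated with a minimal sketch. The example below is not a method from this research; it assumes a simple linear model fitted by gradient descent, chosen only because it makes each component easy to see. All function names and the data are hypothetical.

```python
# Toy illustration of the three ML components: decision process,
# error function, and optimization process.
# Assumption: a one-variable linear model fitted by gradient descent.
# All names and data are hypothetical, not taken from the research project.

def predict(weight, bias, x):
    """Decision process: turn an input into a guess."""
    return weight * x + bias


def mean_squared_error(weight, bias, xs, ys):
    """Error function: average squared miss between guesses and observations."""
    total = sum((predict(weight, bias, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / len(xs)


def fit(xs, ys, learning_rate=0.05, steps=2000):
    """Optimization process: adjust the parameters to shrink the error each step."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # How far each current guess is from the observed value.
        errors = [predict(weight, bias, x) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step against the gradient so the next guess misses by less.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# Made-up observations that follow y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

error_before = mean_squared_error(0.0, 0.0, xs, ys)
weight, bias = fit(xs, ys)
error_after = mean_squared_error(weight, bias, xs, ys)
```

In this sketch, each pass through the loop uses the misses from the previous guesses to update the parameters, so `error_after` ends up far smaller than `error_before`. Practical safety applications would use richer models and real data, but the same three roles apply.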

Big data is defined as larger, more complex data sets, including data from new sources. The size of the data sets (volume) makes it impossible for traditional data processing software to manage them. Additionally, the velocity of some big data streams may require real-time or near-real-time processing. And the variety of big data, often arriving in new, unstructured or semi-structured data types, requires additional preprocessing to derive meaning (Tiao 2024).

User Guide Organization

This user guide is organized into five chapters, as follows:

Chapter 1: Introduction starts with the importance of transportation safety in our society and shares the intended audience for this user guide. It describes the objective of the research—advancing the use of AI and ML in analyzing big data—and then defines each of those terms.

Chapter 2: Traditional Safety Evaluations presents an overview of the historical and current standard practices related to transportation safety analysis and their limitations.

Chapter 3: How AI, ML, and Big Data Can Improve Safety Evaluations makes the initial case for using AI, ML, and big data to improve safety analysis. In this chapter, new and enhanced data sets (e.g., trajectory data from video analytics and outputs from connected vehicles) and advanced analysis methods (e.g., ML models to identify objects in photos, algorithms, and databases to process big data sets) are described and suggestions on how they can be used to improve safety data analysis are provided.

Chapter 4: Applying AI, ML, and Big Data to Safety Analysis introduces step-by-step instructions to help practitioners and other researchers apply the methods developed by the NCHRP Project 17-100 team to similar data sets for similar purposes. It is organized around project case studies: two pilot research efforts conducted as part of this research project. One case study involved the Oregon Department of Transportation (ODOT) and applied the ML tools and approach developed by the research team to identify roadside objects, improving the completeness of Oregonʼs inventory database. The other, in partnership with the City of Bellevue in Washington State, involved analyzing object trajectories (a large dataset) generated outside this study using a video analytics platform.

Chapter 5: Additional AI, ML, and Big Data Applications introduces six follow-up research studies, each of which builds upon the tools developed and methodologies studied as part of this research effort. These include the following: classified vehicle volume from loop detector data, turning movement counts (TMC) from connected vehicle data, lane markings and width from light detection and ranging (LiDAR) sensor data, traffic sign detection (TSD) and recognition from road log videos, pedestrian detection from mounted surveillance cameras, and road surface condition and vehicle volume from edge devices.

References and Bibliography lists the references cited in the report.

Abbreviations and Acronyms presents the abbreviations of keywords used in this report.
