Suggested Citation: "3 Research Findings." National Academies of Sciences, Engineering, and Medicine. 2026. Data Ontologies for Data-Driven Decision-Making: Research Approach and Findings. Washington, DC: The National Academies Press. doi: 10.17226/29374.

3. Chapter 3 – Research Findings

This chapter summarizes the outcomes of the literature review, surveys, and in-depth interviews, as well as the case studies and methodology for ontology development.

3.1. Literature Review

The literature review focused on the definition of standard terms, types of ontologies, frameworks, and solutions used in knowledge engineering, as well as State DOT examples of re-engineering legacy information systems. To accomplish this, various materials were selected from academic and professional authors from the early 2000s to the present day. The research team believes that the diversity and breadth of the materials reviewed were sufficient to understand current practices, identify gaps, and leverage ongoing DOT practices in re-engineering legacy data systems. The following subsections highlight key findings and challenges, while Appendix A contains the complete literature review document. The outcome of the review process accomplished the following:

  1. Defined standard terms related to ontologies;
  2. Identified types of ontologies, including classifications and descriptions;
  3. Described the framework and solutions for knowledge engineering;
  4. Provided DOT examples in re-engineering legacy information systems; and
  5. Summarized organizational cultural factors that impact data ontology design.

3.1.1. Summary of Findings

The following findings were gathered from the literature:

  • There is an increasing need for semantic interoperability between legacy and emerging data systems to support effective and efficient decision-making across the business functions of DOTs.
  • It is evident that a business-driven data structure and model can support the development of effective information systems that leverage legacy systems.
  • Ontologies can be classified along a spectrum from informal to formal.
  • There are different approaches to developing ontologies, including top-down, middle-out, and bottom-up, depending on the level of conceptualization.
  • Research on ontologies in the infrastructure sector is advancing but has not yet matured enough to meet the needs of practice. Different ontologies across private and public transport systems focus on urban infrastructure planning and maintenance.
  • The literature contains several knowledge organization frameworks, tools, and methods for solving problems related to revitalizing legacy systems. However, many have not proven to be practical because they tackle the issue from a technology perspective while failing to deliver semantically rich models of legacy systems’ business data.
  • Several solutions that reuse legacy system information have emerged to enable the extraction of concepts from data in programs, the acquisition of requirements for enterprises to transform potential data and system users’ needs into complete, precise, and consistent requirement specifications, and the analysis, selection, and use of available tools.
  • State DOTs have conducted studies and developed tools and methodologies that serve as frameworks for integrating data programs, particularly those related to financial and accounting, asset management, and highway performance management, including safety. These examples provide real-world context for the future implementation of ontologies by DOTs.
  • The creation of ontologies itself is a collaborative process that aims to streamline business processes, requiring strong organizational cultural strategies to be successful.
  • To sustain a culture that promotes the deployment and advancement of ontology use in data-driven decision-making, organizations must:
    • Develop a long-term strategy,
    • Secure leadership buy-in,
    • Establish a functional decision-making structure,
    • Implement effective workforce management strategies, and
    • Communicate the rewards associated with the change.

3.1.2. Challenges and Gap Analysis

Based on the literature and the research team’s understanding of the topic area, the following challenges and gaps were identified as the starting point for the Guide to address:

  • Legacy systems often lack internal data standards and documentation, which hinders the development of ontologies.
  • There is limited or no documentation (standard operating procedures) on DOT business functions and processes, which are foundational to building ontologies.
  • Ongoing efforts to integrate legacy systems and migrate them to an enterprise asset have focused mainly on the technical and data integration aspects and have not effectively addressed organizational cultural factors. A more holistic approach that also addresses these factors would enable DOTs to manage resource limitations, such as manpower and financial resource needs, while also supporting program sustainability.
  • The constant turnover in State DOTs has exacerbated the loss of valuable knowledge for developing business rules and ontologies. To address this challenge, agencies must adapt their management approaches, knowledge management strategies, and workforce development initiatives.
  • There is a gap in workforce knowledge and competencies regarding the development and application of data ontologies in State DOTs. As such, there is a need to develop strategies for creating and delivering high-quality professional learning opportunities so that staff can acquire the necessary skill set in ontologies.

The Guide focused on organizational, operational, and tactical strategies to address these challenges and the research objectives, enabling DOTs to develop and use data ontologies as part of their data-driven decision-making processes.

3.2. Targeted Surveys and In-depth Interviews

The surveys and interviews were conducted to understand how agencies implement data ontologies to maximize the use of legacy data and identify best practices for case study development. The following subsections summarize the results of the surveys and interviews; Appendix B contains the complete findings of both surveys and interviews.

3.2.1. Summary of Findings

Because data ontologies are relatively new and developing in the transportation data domain, most of the State DOTs that responded to this survey have little to no experience with them and do not have any formal or intentional ontologies. However, there was evidence that some agencies are developing ontologies during project lifecycles and creating metadata, data dictionaries, and data catalogs. One responding agency reported having proficient experience developing and maintaining ontologies. The agency mentioned using Stanford’s WebProtégé to develop these ontologies, including a data catalog and a proof-of-concept environmental ontology designed to make the agency’s manuals more accessible and searchable. It appears that DOTs are interested in integrating ontologies into their practices and recognize the benefits of implementing data ontologies.

The responding DOTs provided many examples of legacy systems and completed data migrations. The migrations were varied, encompassing financial systems, construction project management systems, LRS, and other systems. A common driver for migrating legacy systems was compliance with federal requirements. Many survey responses cited ARNOLD requirements as a motivation for migrating or upgrading their LRS to include other datasets. Another typical driver was the movement among DOTs towards using a centralized database. Many respondents said that providing their staff with more straightforward ways to access and use data improved operational efficiency, resulting in time and cost savings. Additional drivers included technological advancement and the removal of costly, error-prone manual processes and outdated practices.

Major challenges encountered by DOTs during their migrations included poor data quality and limited access to data during integrations. These issues often arise from inadequate documentation and knowledge management risks, such as the departure of staff who are familiar with legacy systems. Additionally, the growing use of vendor solutions has increased system compatibility challenges. Respondents noted that new systems are frequently incompatible with existing systems or other applications, making the solutions either non-viable or costly and resource-intensive to modify.

3.2.2. Best Practices and Lessons Learned

These surveys and interviews with State DOTs revealed the following best practices and lessons learned.

  • Acknowledgement of the importance of data governance and strategic planning to advance data processes. Many agencies indicated that a major driver for migrating legacy systems was the need to depart from ad hoc, manual data processes that are time-consuming and significantly impacted by staff turnover. To achieve this, the Arkansas DOT (ARDOT) plans to establish a dedicated business unit for managing data governance and business intelligence. The Iowa DOT has developed a Strategic Data Business Plan and an Asset Management Plan to establish governance structures and implementation plans that will foster a more data-driven agency. The Minnesota DOT (MnDOT) employs strategic, phased implementation plans when conducting data migrations, recognizing that migration processes are iterative and continually evolving. Other agencies, including the Vermont Agency of Transportation (VTrans), the Florida DOT (FDOT), and the Utah DOT (UDOT), prioritize involving stakeholders early in the process and garnering support from leadership to obtain funding and the necessary resources to establish data initiatives and perform legacy migrations. In some cases, this can help ease aversions to change that may hinder future data migrations and initiatives. UDOT specifically suggested starting with small groups that can effectively communicate the vision for data management changes from a business standpoint and then expanding those groups once they can demonstrate success.
  • The lack of intentional or documented enterprise-wide ontologies to support ongoing legacy system migration efforts. FDOT is in the process of migrating 79 legacy systems to create a cohesive, consistent database as part of Florida’s Cloud-First policy. UDOT is migrating Oracle systems, some of which are considered outdated, to the Google Cloud Platform, an effort that includes a thorough evaluation of data governance for the project management business system. Most agencies are developing or maintaining data dictionaries and business catalogs, whether for enterprise-wide applications or to support a specific project. Additionally, many agencies have established data repositories that, in conjunction with data dictionaries, help keep staff informed about data availability and usage. VTrans specifically suggested training staff about the assumptions and accuracy of collected data. Documented ontologies were evident at the Washington State DOT (WSDOT), where WebProtégé was used to develop definitions and relationships of business terms across the agency. WSDOT has used publicly available ontologies such as the gist upper ontology, which contains standard business definitions, as a starting point for multiple ontologies. The use of existing ontologies is a strong practice for ontology development, as the process can be arduous, and finding a practical starting point can significantly reduce the time and costs required for development.
  • Advancement of data standardization and centralization. State DOTs are making noticeable strides in standardizing and centralizing their data. Most of this progress is driven by state and federal compliance efforts. ARDOT uses a RoadID/Log Mile System to log data and ensure consistency across datasets. To meet ARNOLD requirements, ARDOT migrated its LRS to combine local road GIS data with NHS/Interstate GIS data, ensuring that all GIS road data now originate from a single source. The agency also favors permanent data changes over reactive ones to prepare for future migrations. FDOT promotes data accessibility by modeling enterprise data with a centralized view through its DIMM program and using extensive metadata. MnDOT does not have ontologies; however, to maintain consistent definitions across business units, the agency uses a controlled vocabulary with drop-down menus and reference data for its systems. Multiple agencies noted that documentation is paramount for standardization and migration efforts.
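
The controlled-vocabulary practice described above can be illustrated with a minimal Python sketch. The field names, allowed terms, and records below are hypothetical and are not drawn from any agency's systems; the sketch only shows how reference data can flag values that fall outside an agreed vocabulary, which is roughly what a drop-down menu enforces at data entry.

```python
# Hypothetical reference data: each field maps to its set of allowed terms.
CONTROLLED_VOCABULARY = {
    "surface_type": {"asphalt", "concrete", "gravel"},
    "functional_class": {"interstate", "arterial", "collector", "local"},
}


def find_violations(record):
    """Return (field, value) pairs whose value is not an allowed term."""
    violations = []
    for field, allowed in CONTROLLED_VOCABULARY.items():
        value = record.get(field)
        # Normalize case and whitespace so "Asphalt " matches "asphalt".
        normalized = value.strip().lower() if isinstance(value, str) else value
        if normalized not in allowed:
            violations.append((field, value))
    return violations


# Hypothetical records, as they might arrive from two business units.
records = [
    {"id": 1, "surface_type": "Asphalt", "functional_class": "arterial"},
    {"id": 2, "surface_type": "blacktop", "functional_class": "local"},
]

for record in records:
    print(record["id"], find_violations(record))
```

In this sketch the second record is flagged because "blacktop" is a local synonym rather than an agreed term, which is the kind of inconsistency a shared vocabulary is meant to surface.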

3.3. Case Studies

The purpose of the case studies was to showcase industry practices, techniques, and methodologies that facilitate the development of data ontologies and provide examples of successfully implemented ontologies. Essential techniques and takeaways from the case studies were summarized into themes, including effective planning, comprehensive assembly, validation, and improvement of ontologies, which are translatable to any domain. The case studies used examples from different domains to illustrate the application of the techniques and strategies for building ontologies. The degree of maturity of the cases ranges from proof-of-concept and pilot projects to fully implemented and maintained efforts. Five case studies were developed and incorporated into the Guide as examples of best practices. The subsections that follow highlight the key takeaways from the case studies.

3.3.1. Summary of Findings

This subsection summarizes key takeaways from the cases. These takeaways have been organized into a four-part framework: designing a plan as an instrument for success, efficiently assembling the ontology, testing the ontology for its efficacy, and adapting to changes so that the ontology stays relevant.

Designing a Plan

  • Leverage Existing Policies. When creating an ontology development plan, it is essential to leverage an agency’s existing policies to support the process. FDOT’s Cloud-First Policy is an excellent example of an opportunity to leverage an existing policy that promotes system interoperability and technological advancement. By migrating its legacy systems to a cloud-based solution, FDOT can integrate its legacy data with current, modern data, creating a centralized platform to access the agency’s information. While FDOT has no documented ontologies, the agency can boast a more streamlined data-gathering process, often one of the most resource-intensive steps of ontology development.
  • Plan Thoroughly and Effectively. Ontology development thrives on effective initial planning. This process begins by identifying the issues, clarifying the purpose of the ontology, and determining the required resources for success. Thorough initial planning provides opportunities to consider less time-consuming alternatives to an ontology, such as a data dictionary or metadata database. For example, WSDOT created its Data Catalog initiative, an accessible database that allows users to search for terms using visual diagrams and their semantic relationships. The development plan must also address how relevant data and evidence of success will be identified, collected, and communicated to achieve sustainable buy-in. An example of effective communication to secure buy-in is WSDOT’s Words Matter initiative, which effectively spread information about WSDOT’s data catalog ontology throughout the agency. Bringing greater awareness to an ontology development effort can communicate its value and garner leadership and user buy-in.
  • Consider Resource Needs. Ontology development is a resource-intensive process requiring sustainable funding, a committed staff, and leadership buy-in. An ontology is at a high risk of abandonment without a dedicated funding source. For example, WSDOT’s PS AID project was developed as a proof-of-concept for organizing information in environmental manuals into an ontology. While the benefits were acknowledged, dedicated funding was not secured, and the project was halted. In addition to dedicated funding, dedicated staff is also critical to the development process. Ontology development requires both technical and subject matter expertise. Identifying the proper personnel in the planning phase can help to ensure that any knowledge risks are mitigated. These needs can be addressed by clearly defining the value of the ontology and demonstrating the benefits to leadership through effective communication of progress and small wins.
  • Leverage Ongoing Data Initiatives. Data governance, data management, and enterprise data system integration can play significant roles in preparing and developing an ontology. With a centralized and consistent database, the ontology development team has a single source of truth from which to build the ontology. This supports accuracy and consistency within the ontology and streamlines the process, as the development team no longer needs to search for the required data and information. Ontologies can also aid legacy system migrations. The migration of legacy systems becomes considerably more complicated as the number of systems increases. These systems often have dependencies and relationships with other systems that can cause issues when migrating one without the others. Ontologies are well-suited to aid this process by capturing the crucial information about enterprise systems that is required for migrating them.
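
As an illustration of how captured system dependencies could inform a migration plan, the following Python sketch derives one valid migration order from a set of "depends on" relationships. It is a minimal sketch under stated assumptions: the system names and dependencies are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) stands in for whatever tooling an agency would actually use.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical "depends on" relationships, as an ontology might record them:
# each system maps to the systems that must be in place before it can migrate.
DEPENDS_ON = {
    "project_management": {"financial", "linear_referencing"},
    "asset_inventory": {"linear_referencing"},
    "financial": set(),
    "linear_referencing": set(),
}


def migration_order(depends_on):
    """Return one order in which every system follows its dependencies."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as error:
        # A cycle means the systems involved cannot be migrated one at a time.
        raise ValueError(f"Circular dependency: {error.args[1]}") from error


order = migration_order(DEPENDS_ON)
print(order)
```

More than one valid order usually exists, so the printed sequence is only one option. A circular dependency raises an error, signaling that the systems involved would need to be migrated together.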

Assembling the Ontology

  • Use Existing Ontologies. A wide range of public ontologies is available as a resource for developing other ontologies. Proper research and identification of existing ontologies that overlap with the proposed ontology can provide a strong foundation, eliminating the need for extensive early development. For example, WSDOT leveraged the gist upper ontology, which encompasses numerous standard business terms, definitions, and semantic relationships. Additionally, the Transit Ontology case highlights a research group building on another research group’s work 15 years later. Furthermore, the Disease Ontology (DO) has served as the basis for numerous other ontologies in the biomedical field. Its developers demonstrate through use cases how DO is used to develop and inform other information bases.
  • Involve Stakeholders and Gather Feedback. During the development phase, it is essential to consult with key stakeholders to ensure that the development process aligns with the goal. Feedback should be gathered from the intended users, ontology experts, and subject matter experts. This input can be used to refine the ontology and identify potential areas for improvement. For example, the team behind the DO development collaborated with the biomedical ontology community to gather information and ensure that their needs were met. This collaboration used a publicly available GitHub repository that contained instructions on adding entries to the ontology, including guidelines on formatting and wording.
  • Use Guiding Questions. Guiding questions should steer the development process. These questions should be asked continuously throughout the process and addressed so that the ontology can provide answers at any given point. By asking the appropriate questions, ontology developers can identify situations that the ontology does not yet address correctly. The transit ontology presented earlier has practical examples of questions that drive ontology development. The questions are potential user queries directly tied to the purpose of the ontology. After development, these queries can be tested to verify and validate that the ontology gives reasonable results.
  • Create Visualizations. Before, during, and after development, visualizations of the ontology can help improve understanding for both users and developers. In developing the Digital Highway Construction Inspection Ontology (HCIOntology) (Case Study 1 in the Guide), Indiana DOT created several visuals to depict a high-level overview of the ontological relationships between different classes. Additionally, the WSDOT Data Catalog used visuals to illustrate specific terms, enabling users to visualize and better understand the relationships between entities and classes. Creating these visualizations during development can give the team a quick reference to the ontology framework.
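
The idea of treating guiding questions as testable queries can be sketched in a few lines of Python. This is an illustrative toy, not the transit ontology from the case study: the triples, the `match` helper, and the stop and route names are all hypothetical, and a real project would typically use an RDF store and a query language rather than an in-memory set.

```python
# Hypothetical (subject, predicate, object) triples for a small transit ontology.
TRIPLES = {
    ("Route10", "is_a", "BusRoute"),
    ("Route10", "serves_stop", "MainStreetStop"),
    ("Route10", "serves_stop", "CityHallStop"),
    ("Route22", "is_a", "BusRoute"),
    ("Route22", "serves_stop", "CityHallStop"),
}


def match(subject=None, predicate=None, obj=None):
    """Return sorted triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def routes_serving(stop):
    """Guiding question: which routes serve a given stop?"""
    return [s for (s, _, _) in match(predicate="serves_stop", obj=stop)]


print(routes_serving("CityHallStop"))  # ['Route10', 'Route22']
print(routes_serving("AirportStop"))   # [] -> a gap the ontology does not cover
```

An empty answer to a question that users are expected to ask is a signal that the ontology does not yet address that situation.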

Testing the Ontology

  • Verify and Validate the Ontology. It is essential to verify that the ontology produces reasonable results based on the queries and restrictions. In addition, validating the output against use cases builds confidence in the ontology. Tools are available that check ontologies for contradictions and other issues. One of these is HermiT, a reasoner packaged with the commonly used Protégé ontology-building tool, which checks semantic relationships for consistency. SPARQL queries are also useful: running them against the ontology and confirming that the results are reasonable and accurate helps validate it. For instance, the HCIOntology developers used HermiT and SPARQL queries to verify the ontology’s consistency and obtain reasonable results. It is essential to note that this validation also requires subject matter expertise to comprehend the nuances of the ontology and recognize what constitutes reasonable and correct results.
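
To give a sense of what a consistency check looks for, the sketch below flags individuals asserted to belong to two classes that have been declared disjoint. It is a deliberately simplified stand-in written for illustration: it is not HermiT, it does not perform logical inference, and the class and individual names are hypothetical rather than taken from HCIOntology.

```python
# Hypothetical class assertions: individual -> set of classes it belongs to.
CLASS_ASSERTIONS = {
    "Inspection_001": {"ConcreteInspection"},
    "Inspection_002": {"ConcreteInspection", "AsphaltInspection"},
}

# Hypothetical disjointness axioms: no individual may belong to both classes.
DISJOINT_PAIRS = [("ConcreteInspection", "AsphaltInspection")]


def find_inconsistencies(assertions, disjoint_pairs):
    """Return (individual, class_a, class_b) for each violated disjointness axiom."""
    problems = []
    for individual, classes in sorted(assertions.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in classes and class_b in classes:
                problems.append((individual, class_a, class_b))
    return problems


print(find_inconsistencies(CLASS_ASSERTIONS, DISJOINT_PAIRS))
```

A full reasoner checks many more kinds of axioms and also derives implied facts, but the output has the same purpose: a list of contradictions for a subject matter expert to review.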

Adapting to Changes

  • Keep the Ontology Updated. Ontology building is a continuous journey that requires constant improvements. The ontology must be refined and updated as new terms, definitions, and semantic relationships are created or identified. Because ontology development is resource-intensive, as mentioned earlier, continuous updates are essential to protect that investment and keep the ontology relevant. The DO is an excellent example of this, as the developers consistently update the database with new terms and diseases as they become available. The HCIOntology developers update the ontology to ensure consistency with the yearly updates to the construction inspection standard specification. Defining a structured and clear workflow that allows for the updates to be gathered, checked, approved, and communicated is essential to maintaining the consistency and accuracy of the ontology.
  • Maximize Ontology Value. During and after initial ontology development, the development team can explore ways to gather feedback on how to enhance and maximize the value of the ontology. Additional use cases may be identified and, resources permitting, implemented into the ontology. As WSDOT developed the Data Catalog, the agency recognized that it could gain additional value by expanding beyond the original scope. The project initially started as an assessment of the agency’s data maturity. It then evolved into the Data Catalog, which was further enhanced by the Words Matter initiative. Additionally, methods of presenting the ontology should be explored. Applications, dashboards, and searchable dictionaries are common ways ontologies are implemented to maximize use. For example, HCIOntology was integrated into an application that dynamically generates inspection forms based on guided inputs from an inspector. This automated a previously manual process, saving time and reducing the risk of human error. The HCIOntology team is also looking into using AI and Natural Language Processing (NLP) to more efficiently gather and organize the data for the ontology and provide users with more streamlined access to the information.
  • Build Ontology Libraries. Ontology libraries are systems that gather, document, and store ontologies from various sources, enabling users to discover, explore, and use the ontologies and their associated classes and hierarchies. If an agency is looking to develop multiple ontologies, using an ontology library to create a centralized resource for staff to access the information could help maximize the value and future effectiveness of the ontologies. For example, DO is part of BioPortal, a library of biomedical ontologies. This ontology library comprises over 1,100 biomedical ontologies, enabling users to search for both specific ontologies and classes within them. Transportation agencies can benefit from a simple repository of their ontologies for internal and external use by partners and peers.
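
The core function of an ontology library, searching for classes across several ontologies at once, can be sketched as follows. The library contents and the `search_classes` helper are hypothetical and greatly simplified; this is not how BioPortal is implemented, only an illustration of the kind of lookup a simple agency repository could offer.

```python
# Hypothetical library: ontology name -> class labels it defines.
LIBRARY = {
    "construction_inspection": ["Inspection", "Pay Item", "Specification"],
    "data_catalog": ["Data Set", "Business Term", "Data Steward"],
    "environmental_manual": ["Permit", "Specification", "Mitigation Measure"],
}


def search_classes(term):
    """Return (ontology, class_label) pairs whose label contains the term."""
    needle = term.lower()
    return [
        (ontology, label)
        for ontology, labels in sorted(LIBRARY.items())
        for label in labels
        if needle in label.lower()
    ]


print(search_classes("specification"))
```

A search that returns the same label from two ontologies, as "Specification" does here, also points to terms that may need a shared definition across business areas.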

If applied properly, these takeaways offer practical guidance for transportation agencies on institutionalizing data ontologies across functional divisions, business areas, and operational units in a structured way.
