The Committee on Foundational Research Gaps and Future Directions for Digital Twins uses the following definition of a digital twin, modified from a definition published by the American Institute of Aeronautics and Astronautics (AIAA Digital Engineering Integration Committee 2020):
A digital twin is a set of virtual information constructs that mimics the structure, context, and behavior of a natural, engineered, or social system (or system-of-systems), is dynamically updated with data from its physical twin, has a predictive capability, and informs decisions that realize value. The bidirectional interaction between the virtual and the physical is central to the digital twin.
The study committee’s refined definition refers to “a natural, engineered, or social system (or system-of-systems)” to describe digital twins of physical systems in the broadest sense possible, including the engineered world, natural phenomena, biological entities, and social systems. This definition introduces the phrase “predictive capability” to emphasize that a digital twin must be able to issue predictions beyond the available data to drive decisions that realize value. Finally, this definition highlights the bidirectional interaction, which comprises feedback flows of information from the physical system to the virtual representation and from the virtual back to the physical system to enable decision-making, either automatic or with humans-in-the-loop.
NOTE: This summary highlights key messages from the report but is not exhaustive. In order to support the flow and readability of this abridged summary, the findings, conclusions, and recommendations may be ordered differently than in the main body of the report, but they retain the same numbering scheme for searchability.
Digital twins hold immense promise in accelerating scientific discovery and revolutionizing industries. Digital twins can be a critical tool for decision-making based on a synergistic combination of models and data. The bidirectional interplay between a physical system and its virtual representation endows the digital twin with a dynamic nature that goes beyond what has been traditionally possible with modeling and simulation, creating a virtual representation that evolves with the system over time. By enabling predictive insights and effective optimizations, monitoring performance to detect anomalies and exceptional conditions, and simulating dynamic system behavior, digital twins have the capacity to revolutionize scientific research, enhance operational efficiency, optimize production strategies, reduce time-to-market, and unlock new avenues for scientific and industrial growth and innovation. The use cases for digital twins are diverse and proliferating, with applications across multiple areas of science, technology, and society, and their potential is wide-reaching. Yet key research needs remain to advance digital twins in several domains.
This report is the result of a study that addressed the following key topics:
While there is significant enthusiasm around industry developments and applications of digital twins, the focus of this report is on identifying research gaps and opportunities. The report’s recommendations are particularly targeted toward what agencies and researchers can do to advance mathematical, statistical, and computational foundations of digital twins.
The notion of a digital twin goes beyond simulation to include tighter integration between models, data, and decisions. The dynamic, bidirectional interaction tailors the digital twin to a particular physical counterpart and supports the evolution of the virtual representation as the physical counterpart evolves. This bidirectional interaction is sometimes characterized as a feedback loop, where data from the physical counterpart are used to update the virtual models, and, in turn, the virtual models are used to drive changes in the physical system.
This feedback loop may occur in real time, such as for dynamic control of an autonomous vehicle or a wind farm, or it may occur on slower time scales, such as post-flight updating of a digital twin for aircraft engine predictive maintenance or post-imaging updating of a digital twin and subsequent treatment planning for a cancer patient.
The digital twin provides decision support when a human plays a decision-making role, or decision-making may be shared jointly between the digital twin and a human as a human–agent team. Human–digital twin interactions may also involve the human playing a crucial role in designing, managing, and operating elements of the digital twin, including selecting sensors and data sources, managing the models underlying the virtual representation, and implementing algorithms and analytics tools.
Finding 2-1: A digital twin is more than just simulation and modeling.
Conclusion 2-1: The key elements that comprise a digital twin include (1) modeling and simulation to create a virtual representation of a physical counterpart, and (2) a bidirectional interaction between the virtual and the physical. This bidirectional interaction forms a feedback loop that comprises dynamic data-driven model updating (e.g., sensor fusion, inversion, data assimilation) and optimal decision-making (e.g., control, sensor steering).
These elements are depicted in Figure S-1.
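As an illustration only, the feedback loop described in Conclusion 2-1 can be sketched with a scalar Kalman filter for the physical-to-virtual update and a proportional controller for the virtual-to-physical decision. Everything here is an assumption made for the sketch rather than something taken from the report: the names `TwinState`, `assimilate`, `decide`, `predict`, and `simulate`, the additive dynamics, the noise values, and the control gain.

```python
from dataclasses import dataclass


@dataclass
class TwinState:
    mean: float      # virtual twin's estimate of the physical state
    variance: float  # uncertainty attached to that estimate


def assimilate(state, observation, obs_var):
    """Physical-to-virtual flow: scalar Kalman update with one sensor reading."""
    gain = state.variance / (state.variance + obs_var)
    mean = state.mean + gain * (observation - state.mean)
    variance = (1.0 - gain) * state.variance
    return TwinState(mean, variance)


def decide(state, target, control_gain=0.5):
    """Virtual-to-physical flow: proportional control toward a target value."""
    return control_gain * (target - state.mean)


def predict(state, control, process_var):
    """Advance the virtual model one step (assumed dynamics: x_next = x + control)."""
    return TwinState(state.mean + control, state.variance + process_var)


def simulate(steps=20, target=10.0):
    """Run the loop against a stand-in physical system (hypothetical)."""
    true_x = 0.0                               # hidden physical state
    state = TwinState(mean=5.0, variance=4.0)  # deliberately wrong prior
    noise = [0.3, -0.2, 0.1, -0.1]             # fixed pseudo-noise for repeatability
    for k in range(steps):
        reading = true_x + noise[k % len(noise)]        # sense
        state = assimilate(state, reading, obs_var=0.25)  # update the twin
        control = decide(state, target)                 # decide
        true_x += control                               # act on the physical system
        state = predict(state, control, process_var=0.01)
    return true_x, state
```

Calling `simulate()` returns the final physical state and the twin's final estimate. In this toy setting the physical state is driven close to the target while the twin's variance shrinks well below its prior value. A real digital twin would replace each function with domain-specific models, inversion or data assimilation methods, and decision logic.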
An important theme that runs throughout this report is the notion that the digital twin virtual representation be “fit for purpose,” meaning that the virtual representation—model types, fidelity, resolution, parameterization, and quantities of interest—be chosen, and in many cases dynamically adapted, to fit the particular decision task and computational constraints at hand.
Conclusion 3-1: A digital twin should be defined at a level of fidelity and resolution that makes it fit for purpose. Important considerations are the required level of fidelity for prediction of the quantities of interest, the available computational resources, and the acceptable cost. This may lead to the digital twin including high-fidelity, simplified, or surrogate models, as well as a mixture thereof. Furthermore, a digital twin may include the ability to represent and query the virtual models at variable levels of resolution and fidelity depending on the particular task at hand and the available resources (e.g., time, computing, bandwidth, data).
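One way to picture the fit-for-purpose idea is as a selection among models of differing fidelity, subject to an accuracy requirement and a time budget. The sketch below is a minimal illustration under stated assumptions: the `ModelOption` class, the `select_model` function, and all error and cost figures are hypothetical and do not come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    expected_error: float  # estimated error in the quantity of interest
    cost_seconds: float    # estimated time-to-solution per query


def select_model(options, max_error, budget_seconds):
    """Return the cheapest option that meets both constraints, or None."""
    feasible = [
        m for m in options
        if m.expected_error <= max_error and m.cost_seconds <= budget_seconds
    ]
    return min(feasible, key=lambda m: m.cost_seconds, default=None)


# Hypothetical catalog of models available to one digital twin.
CATALOG = [
    ModelOption("surrogate", expected_error=0.10, cost_seconds=0.01),
    ModelOption("reduced-order", expected_error=0.03, cost_seconds=2.0),
    ModelOption("high-fidelity", expected_error=0.005, cost_seconds=3600.0),
]
```

For example, `select_model(CATALOG, max_error=0.05, budget_seconds=10.0)` picks the reduced-order option, while a tighter error requirement under the same budget yields `None`, signaling that no available model is fit for that purpose with the resources at hand.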
An additional consideration is the complementary role of models and data: a digital twin is distinguished from traditional modeling and simulation in the way that models and data work together to drive decision-making. In cases in which an abundance of data exists and the decisions to be made fall largely within the realm of conditions represented by the data, a data-centric view of a digital twin is appropriate: the data form the core of the digital twin, the numerical model is likely heavily empirical, and analytics and decision-making wrap around this numerical model. In other cases that are data-poor and call on the digital twin to issue predictions in extrapolatory regimes that go well beyond the available data, a model-centric view of a digital twin is appropriate: a mathematical model and its associated numerical model form the core of the digital twin, and data are assimilated through the lens of these models. An important need is to advance hybrid modeling approaches that leverage the synergistic strengths of data-driven and model-driven digital twin formulations.
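A very small example of a hybrid formulation is a mechanistic model whose systematic error is corrected by a term learned from data. The sketch below assumes a hypothetical mechanistic model `mechanistic(x) = 2x` and fits a straight line to its residuals by ordinary least squares. The function names and the data are illustrative assumptions, not a method prescribed by the report.

```python
def mechanistic(x):
    """Hypothetical physics-based model that omits part of the true behavior."""
    return 2.0 * x


def fit_residual(xs, ys):
    """Ordinary least-squares line through the residuals y - mechanistic(x)."""
    residuals = [y - mechanistic(x) for x, y in zip(xs, ys)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, residuals))
    slope = sxr / sxx
    intercept = mean_r - slope * mean_x
    return slope, intercept


def hybrid(x, slope, intercept):
    """Mechanistic prediction plus the data-driven correction."""
    return mechanistic(x) + slope * x + intercept


# Synthetic observations generated from y = 2.5 * x + 1 (an assumption for the demo).
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.5, 6.0, 8.5, 11.0]
SLOPE, INTERCEPT = fit_residual(XS, YS)
```

Here the data-driven term recovers the part of the response that the mechanistic model misses, while the mechanistic term continues to carry the known physics. In practice the correction would often be a richer statistical or machine learning model, and its validity outside the range of the data would need separate scrutiny.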
Despite the existence of examples of digital twins providing practical impact and value, the sentiment expressed across multiple committee information-gathering sessions is that the publicity around digital twins and digital twin solutions currently outweighs the evidence base of success.
Conclusion 2-5: Digital twins have been the subject of widespread interest and enthusiasm; it is challenging to separate what is true from what is merely aspirational, due to a lack of agreement across domains and sectors as well as misinformation. It is important to separate the aspirational from the actual to strengthen the credibility of the research in digital twins and to recognize that serious research questions remain in order to achieve the aspirational.
Conclusion 2-6: Realizing the potential of digital twins requires an integrated research agenda that advances each one of the key digital twin elements and, importantly, a holistic perspective of their interdependencies and interactions. This integrated research agenda includes foundational needs that span multiple domains as well as domain-specific needs.
Recommendation 1: Federal agencies should launch new crosscutting programs, such as those listed below, to advance mathematical, statistical, and computational foundations for digital twins. As these new digital twin–focused efforts are created and launched, federal agencies should identify opportunities for cross-agency interactions and facilitate cross-community collaborations where fruitful. An interagency working group may be helpful to ensure coordination.
Verification, validation, and uncertainty quantification (VVUQ) is an area of particular need that necessitates collaborative and interdisciplinary investment to advance the responsible development, implementation, monitoring, and sustainability of digital twins. Evolution of the physical counterpart in real-world use conditions, changes in data collection, noisiness of data, addition and deletion of data sources, changes in the distribution of the data shared with the virtual twin, changes in the prediction and/or decision tasks posed to the digital twin, and evolution of the digital twin virtual models all have consequences for VVUQ.
VVUQ must play a role in all elements of the digital twin ecosystem. In the digital twin virtual representation, verification and validation play key roles in building trustworthiness, while uncertainty quantification gives measures of the quality of prediction. Many of the elements of VVUQ for digital twins are shared with VVUQ for computational models (NRC 2012), although digital twins bring some additional challenges. Common challenges arise from model discrepancies, unresolved scales, surrogate modeling, and the need to issue predictions in extrapolatory regimes. However, digital twin VVUQ must also address the uncertainties associated with the physical counterpart, including changes to sensors or data collection equipment, and the continual evolution of the physical counterpart’s state. Data quality improvements may be prioritized based on the relative impacts of parameter uncertainties on the model uncertainties. VVUQ also plays a role in understanding the impact of mechanisms used to pass information between the physical and virtual. These include challenges arising from parameter uncertainty and ill-posed or indeterminate inverse problems, in addition to the uncertainty introduced by the inclusion of the human-in-the-loop.
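The idea of prioritizing data quality improvements by the relative impact of parameter uncertainties can be illustrated with a first-order (delta-method) variance decomposition. This is a sketch under strong assumptions: parameters are treated as independent, the model is treated as locally near-linear, and the `deflection` model with its means and standard deviations is entirely hypothetical.

```python
def variance_contributions(model, means, stds, rel_step=1e-6):
    """Approximate each parameter's share of output variance (first order).

    Sensitivities are estimated by central finite differences around the
    mean parameter values. At least one parameter must influence the output.
    """
    contributions = {}
    for name in means:
        h = rel_step * max(abs(means[name]), 1.0)
        up = dict(means)
        up[name] += h
        down = dict(means)
        down[name] -= h
        sensitivity = (model(**up) - model(**down)) / (2.0 * h)
        contributions[name] = (sensitivity * stds[name]) ** 2
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}


def deflection(load, stiffness):
    """Hypothetical quantity of interest: deflection = load / stiffness."""
    return load / stiffness


SHARES = variance_contributions(
    deflection,
    means={"load": 100.0, "stiffness": 50.0},
    stds={"load": 5.0, "stiffness": 5.0},
)
```

In this toy case the stiffness uncertainty accounts for roughly four-fifths of the first-order output variance, which would suggest improving the stiffness measurement before the load measurement. For strongly nonlinear models or correlated parameters, sampling-based or global sensitivity methods would be more appropriate than this local approximation.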
Conclusion 2-2: Digital twins require VVUQ to be a continual process that must adapt to changes in the physical counterpart, digital twin virtual models, data, and the prediction/decision task at hand. A gap exists between the class of problems that has been considered in traditional modeling and simulation settings and the VVUQ problems that will arise for digital twins.
Conclusion 2-3: Despite the growing use of artificial intelligence, machine learning, and empirical modeling in engineering and scientific applications, there is a lack of standards in reporting VVUQ as well as a lack of consideration of confidence in modeling outputs.
Conclusion 2-4: Methods for ensuring continual VVUQ and monitoring of digital twins are required to establish trust. It is critical that VVUQ be deeply embedded in the design, creation, and deployment of digital twins. In future digital twin research developments, VVUQ should play a core role and tight integration should be emphasized. Particular areas of research need include continual verification, continual validation, VVUQ in extrapolatory conditions, and scalable algorithms for complex multiscale, multiphysics, and multi-code digital twin software efforts. There is a need to establish to what extent VVUQ approaches can be incorporated into automated online operations of digital twins and where new approaches to online VVUQ may be required.
Recommendation 2: Federal agencies should ensure that verification, validation, and uncertainty quantification (VVUQ) is an integral part of new digital twin programs. In crafting programs to advance the digital twin VVUQ research agenda, federal agencies should pay attention to the importance of (1) overarching complex multiscale, multiphysics problems as catalysts to promote interdisciplinary cooperation; (2) the availability and effective use of data and computational resources; (3) collaborations between academia and mission-driven government laboratories and agencies; and (4) opportunities to include digital twin VVUQ in educational programs. Federal agencies should consider the Department of Energy Predictive Science Academic Alliance Program as a possible model to emulate.
A fundamental challenge for digital twins is the vast range of spatial and temporal scales that the virtual representation may need to address. In many applications, a gap remains between the scales that can be simulated and actionable scales. An additional challenge is that as finer scales are resolved and a given model achieves greater fidelity to the physical counterpart it simulates, the computational and data storage/analysis requirements increase. This limits the applicability of the model for some purposes, such as uncertainty quantification, probabilistic prediction, scenario testing, and visualization.
Finding 3-2: Different applications of digital twins drive different requirements for modeling fidelity, data, precision, accuracy, visualization, and time-to-solution, yet many of the potential uses of digital twins are currently intractable to realize with existing computational resources.
Recommendation 3: In crafting research programs to advance the foundations and applications of digital twins, federal agencies should create mechanisms to provide digital twin researchers with computational resources, recognizing the large existing gap between simulated and actionable scales and the differing levels of maturity of high-performance computing across communities.
Mathematical and algorithmic advances in data-driven modeling and multiscale physics-based modeling are necessary elements for closing the gap between simulated and actionable scales. Reductions in computational and data requirements achieved through algorithmic advances are an important complement to increased computing resources. Important areas to advance include hybrid modeling approaches—a synergistic combination of empirical and mechanistic modeling approaches that leverage the best of both data-driven and model-driven formulations—and surrogate modeling approaches. Key gaps, research needs, and opportunities include the following:
Digital twins rely on observation of the physical counterpart in conjunction with modeling to inform the virtual representation. In many applications, these data will be multimodal, from disparate sources, and of varying quality. While significant literature has been devoted to best practices around gathering and preparing data for use, several important gaps and opportunities are crucial for robust digital twins. Key gaps, research needs, and opportunities include the following:
In the digital twin feedback flow from physical to virtual, inverse problem methodologies and data assimilation are required to combine physical observations and virtual models in a rigorous, systematic, and scalable way. Specific challenges for digital twins, such as calibration and updating on actionable time scales, highlight foundational gaps in inverse problem and data assimilation theory, methodology, and computational approaches. Machine learning (ML) and artificial intelligence (AI) potentially have large roles to play in addressing these challenges, such as through the use of online learning techniques for continuously updating models using streaming data. In addition, in settings where data are limited due to data acquisition resource constraints, AI approaches such as active learning and reinforcement learning can help guide the collection of additional data most salient to the digital twin’s objectives.
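As a hedged illustration of how active learning might guide data collection, the sketch below uses query-by-committee: given an ensemble of plausible calibrated models, it selects the candidate measurement where the ensemble members disagree most. The function `next_measurement`, the quadratic ensemble, and the candidate inputs are assumptions made for this example only.

```python
from functools import partial
from statistics import pvariance


def next_measurement(candidates, ensemble):
    """Pick the candidate input where ensemble predictions vary the most."""
    def disagreement(x):
        return pvariance([member(x) for member in ensemble])
    return max(candidates, key=disagreement)


def quadratic(x, coefficient):
    """Hypothetical model family: response = coefficient * x**2."""
    return coefficient * x ** 2


# Three equally plausible calibrations of the same model (illustrative values).
ENSEMBLE = [partial(quadratic, coefficient=c) for c in (0.9, 1.0, 1.1)]
CANDIDATES = [0.0, 1.0, 2.0, 3.0]
CHOSEN = next_measurement(CANDIDATES, ENSEMBLE)
```

With this ensemble the disagreement grows with the input, so the largest candidate is selected as the most informative place to measure next. A practical scheme would also weigh the cost of each measurement and the relevance of the reduced uncertainty to the decision at hand.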
On the virtual-to-physical flowpath, the digital twin is used to drive changes in the physical counterpart itself, or in the observational systems associated with the physical counterpart, through an automatic controller or a human.
Accordingly, the committee identified gaps associated with the use of digital twins for automated decision-making tasks, for providing decision support to a human decision-maker, and for decision tasks that are shared jointly within a human–agent team. There are additional challenges associated with the ethics and social implications of the use of digital twins in decision-making. Key gaps, research needs, and opportunities in the physical-to-virtual and virtual-to-physical feedback flows include the following:
Protecting individual privacy requires proactive ethical consideration at every phase of development and within each element of the digital twin ecosystem. Moreover, the tight integration between the physical system and its virtual representation has significant cybersecurity implications, beyond what has historically been needed, that must be considered in order to effectively safeguard and scale digital twins. While security issues with digital twins share common challenges with cybersecurity issues in other settings, the close relationship between cyber and physical in digital twins could make cybersecurity more challenging. Privacy, ownership, and responsibility for data accuracy in complex, heterogeneous digital twin environments are all areas with important open questions that require attention. While the committee noted that many data ethics and governance issues fall outside the study’s charge, it is important to highlight the dangers of scaling digital twins without actionable standards for appropriate use and guidelines for identifying liability in the case of misuse. Furthermore, digital twins necessitate heightened levels of security, particularly around the transmission of data and information between the physical and virtual counterparts. Especially in sensitive or high-risk settings, malicious interactions could result in security risks for the physical system. Additional safeguard design is necessary for digital twins.
Realizing the societal benefits of digital twins will require both incremental and more dramatic research advances in cross-disciplinary approaches. In addition to bridging fundamental research challenges in statistics, mathematics, and computing, bringing complex digital twins to fruition necessitates robust and reliable yet agile and adaptable integration of all these disparate pieces.
Over time, the digital twin will likely need to meet new demands, incorporate new or updated models, and obtain new data from the physical system to maintain its accuracy. Model management is key for supporting the digital twin evolution. For a digital twin to faithfully reflect temporal and spatial changes where applicable in the physical counterpart, the resulting predictions must be reproducible, incorporate improvements in the virtual representation, and be reusable in scenarios not originally envisioned. This, in turn, requires a design approach to digital twin development and evolution that is holistic, robust, and enduring, yet flexible, composable, and adaptable. Digital twins require a foundational backbone that, in whole or in part, is reusable across multiple domains, supports multiple diverse activities, and serves the needs of multiple users. Digital twins must seamlessly operate in a heterogeneous and distributed infrastructure supporting a broad spectrum of operational environments, ranging from hand-held mobile devices accessing digital twins on-the-go to large-scale, centralized high-performance computing installations. Sustaining a robust, flexible, dynamic, accessible, and secure digital twin is a key consideration for creators, funders, and the diverse community of stakeholders.
Conclusion 7-1: The notion of a digital twin has inherent value because it gives an identity to the virtual representation. This makes the virtual representation—the mathematical, statistical, and computational models of the system and its data—an asset that should receive investment and sustainment in ways that parallel investment and sustainment in the physical counterpart.
Recommendation 4: Federal agencies should each conduct an assessment for their major use cases of digital twin needs to maintain and sustain data, software, sensors, and virtual models. These assessments should drive the definition and establishment of new programs similar to the National Science Foundation’s Natural Hazards Engineering Research Infrastructure and Cyberinfrastructure for Sustained Scientific Innovation programs. These programs should target specific communities and provide support to sustain, maintain, and manage the life cycle of digital twins beyond their initial creation, recognizing that sustainability is critical to realizing the value of upstream investments in the virtual representations that underlie digital twins.
There are domain-specific and even use-specific digital twin challenges, but there are also many elements that cut across domains and use cases. For digital twin virtual representations, advancing the models themselves is necessarily domain-specific, but advancing the digital twin enablers of hybrid modeling and surrogate modeling embodies shared challenges that crosscut domains. For the physical counterpart, many of the challenges around sensor technologies and data are domain-specific, but issues around handling and fusing multimodal data, enabling access to data, and advancing data curation practices embody shared challenges that crosscut domains. When it comes to the physical-to-virtual and virtual-to-physical flows, there is an opportunity to advance data assimilation, inverse methods, control, and sensor-steering methodologies that are applicable across domains, while at the same time recognizing domain-specific needs, especially as they relate to the domain-specific nature of decision-making. Finally, there is a significant opportunity to advance digital twin VVUQ methods and practices in ways that translate across domains.
As stakeholders consider architecting programs that balance these domain-specific needs with cross-domain opportunities, it is important to recognize that different domains have varying levels of maturity with respect to the different elements of the digital twin. For example, the Earth system science community is a leader in data assimilation; many fields of engineering are leaders in integrating VVUQ into simulation-based decision-making; and the medical community has a strong culture of prioritizing the role of a human decision-maker when advancing new technologies. Cross-domain interactions through the common lens of digital twins are opportunities to share, learn, and cross-fertilize.
Conclusion 7-2: As the foundations of digital twins are established, it is the ideal time to examine the architecture, interfaces, bidirectional workflows of the virtual twin with the physical counterpart, and community practices in order to make evolutionary advances that benefit all disciplinary communities.
Recommendation 5: Agencies should collaboratively and in a coordinated fashion provide cross-disciplinary workshops and venues to foster identification of those aspects of digital twin research and development that would benefit from a common approach and which specific research topics are shared. Such activities should encompass responsible use of digital twins and should necessarily include international collaborators.
Recommendation 6: Federal agencies should identify targeted areas relevant to their individual or collective missions where collaboration with industry would advance research and translation. Initial examples might include the following:
There is a history of both sharing and coordination of models within the international climate research community as well as a consistent commitment to data exchange that is beneficial to digital twins. While other disciplines have open-source or shared models, few support the breadth in scale and the robust integration of uncertainty quantification that are found in Earth system models and workflows. A greater level of coordination among the multidisciplinary teams of other complex systems, such as biomedical systems, would benefit maturation and cultivate the adoption of digital twins.
Conclusion 7-4: Fostering a culture of collaborative exchange of data and models that incorporate context through metadata and provenance in digital twin–relevant disciplines could accelerate progress in the development and application of digital twins.
Recommendation 7: In defining new digital twin research efforts, federal agencies should, in the context of their current and future mission priorities, (1) seed the establishment of forums to facilitate good practices for effective collaborative exchange of data and models across disciplines and domains, while addressing the growing privacy and ethics demands of digital twins; (2) foster and/or require collaborative exchange of data and models; and (3) explicitly consider the role for collaboration and coordination with international bodies.
The successful adoption and progress of digital twins hinge on the appropriate education and training of the workforce. This educational shift requires formalizing, nurturing, and growing critical computational, mathematical, and engineering skill sets at the intersection of disciplines such as biology, chemistry, and physics. These critical skill sets include but are not limited to systems engineering, systems thinking and architecting, data analytics, ML/AI, statistical/probabilistic modeling and simulation, uncertainty quantification, computational mathematics, and decision science. These disciplines are rarely taught within the same academic curriculum.
Recommendation 8: Within the next year, federal agencies should organize workshops with participants from industry and academia to identify barriers, explore potential implementation pathways, and incentivize the creation of interdisciplinary degrees at the bachelor’s, master’s, and doctoral levels.
AIAA (American Institute of Aeronautics and Astronautics) Digital Engineering Integration Committee. 2020. “Digital Twin: Definition & Value.” AIAA and AIA Position Paper, AIAA, Reston, VA.
NRC (National Research Council). 2012. Assessing the Reliability of Complex Models: Mathematical and Statistical Foundations of Verification, Validation, and Uncertainty Quantification. Washington, DC: The National Academies Press.