Many Department of Energy (DOE) missions demand rapid analysis and decision making under urgent national security or economic constraints. Geopolitical instability can abruptly disrupt access to critical materials essential for defense systems, requiring the swift identification and qualification of substitutes (Dingreville et al. 2024). Shifts in global manufacturing or adversaries’ adoption of advanced technologies often force DOE programs to adapt legacy tools and processes to new material systems where empirical data may be scarce and existing models unreliable. Similarly, analysts must forecast the outcomes of nonproliferation or emergency scenarios constrained by complex physical dynamics, such as weather evolution or blast propagation (EoP 2022). These challenges conflict with the traditional trial-and-error discovery cycle that still dominates materials development and qualification. Recent work highlights how data-driven foundation models, integrated with physics-based simulations, can sharply compress these timelines from years to days by guiding targeted experiments and enabling high-fidelity predictions of novel engineered systems (Frey et al. 2025).
The national laboratories hold deep institutional expertise, embedded in their workforce, legacy data sets, and extensive experimental and modeling infrastructure. Yet the sheer scale of the DOE system, characterized by siloed specialized knowledge and the complexity of coordinating a large, distributed workforce, can be fundamentally misaligned with the speed and flexibility required for rapid decision making. Foundation models present a unique opportunity to automate the coordination of personnel, user facilities and other experimental infrastructure, and historical data, thereby addressing this long-standing problem of institutional inertia.
Conclusion 4-1: Many DOE missions demand rapid analysis and decision making under urgent national security or economic constraints. While the national laboratories hold deep institutional expertise—embedded in their workforce, legacy data sets, and extensive experimental and modeling infrastructure—the sheer scale of the DOE system, characterized by siloed specialized knowledge and the complexity of coordinating a large, distributed workforce, can be misaligned with the agility required for decisive action. Development of foundation models for this purpose presents a unique opportunity to enable rapid analysis and decision making.
Recommendation 4-1: The Department of Energy should explore the use of foundation models to accelerate situational understanding by unifying dispersed, siloed, and diverse multimodal data sources as input to decision-making frameworks across heterogeneous environments.
In contrast to industrial AI, DOE invested early in materials informatics and high-throughput experimental data curation campaigns, building unique access to data sets through the Materials Genome Initiative and other efforts. By combining advanced AI models, high-performance computing, and curated experimental data, materials informatics can dramatically reduce the search space for viable material substitutes or processes. Recent successes demonstrate this potential: for example, generative machine learning approaches have identified candidate alloy systems that reduce dependence on critical rare-earth elements while preserving key performance properties (Dingreville et al. 2024). In another instance, Microsoft researchers screened over 30 million hypothetical compounds to identify new battery cathode chemistries that could cut lithium demand by as much as 70 percent, a discovery pipeline that traditionally would have required years of sequential lab work (Baker 2024). Given its strong software ecosystem, DOE is uniquely positioned to combine existing efforts where high-throughput fabrication and characterization can be integrated with simulators and knowledge graphs encoding the literature to rapidly identify candidate alternatives for critical materials, processes for manufacturing novel materials, and tools for predicting new materials in poorly understood regimes.
Among DOE’s most unique and critical resources are its large-scale user facilities, specialized manufacturing foundries, high-performance computing centers, and shared experimental platforms. Many of these facilities are already instrumented with dense sensor networks, generating enormous volumes of data that could be exploited for scientific discovery and process optimization.
Advanced manufacturing facilities such as the Kansas City Plant and Y-12 National Security Complex offer unique opportunities for tailored process improvements if information can be analyzed in a decentralized manner while maintaining necessary controls on classified or sensitive data. The Office of Science has previously invested in federated learning approaches to develop distributed machine learning policies across fleets of assets, including user facilities and other systems, with theoretical guarantees of differential privacy. Related efforts have explored how advanced manufacturing processes, such as metal additive manufacturing, can be coordinated across identical machines operating at multiple sites where local conditions affect performance.
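The federated pattern described above can be sketched in a few lines. The toy example below, assuming hypothetical site and aggregation objects (it does not reflect any actual DOE implementation), shows the basic mechanism behind differentially private federated averaging: each site shares only a clipped, noise-perturbed update with the central aggregator, never its raw data.

```python
# Illustrative sketch only: federated averaging with Gaussian-noise
# differential privacy. The Site class and fed_avg_dp function are
# hypothetical stand-ins for distributed facility assets.
import random

class Site:
    """A facility holding local data; shares only a clipped, noised update."""
    def __init__(self, local_mean):
        self.local_mean = local_mean

    def private_update(self, global_model, clip=1.0, sigma=0.1):
        delta = self.local_mean - global_model
        # Clip the update to bound any single site's influence (sensitivity).
        delta = max(-clip, min(clip, delta))
        # Add Gaussian noise calibrated to the clipping bound.
        return delta + random.gauss(0.0, sigma * clip)

def fed_avg_dp(sites, rounds=50, lr=0.5):
    """Server-side loop: aggregate only the privatized updates."""
    model = 0.0
    for _ in range(rounds):
        updates = [s.private_update(model) for s in sites]
        model += lr * sum(updates) / len(updates)
    return model

random.seed(0)
sites = [Site(2.0), Site(2.4), Site(1.8)]
model = fed_avg_dp(sites)
```

The clipping bound is what makes a formal differential-privacy accounting possible; in a production system the noise scale would be chosen from a target privacy budget rather than fixed by hand.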
There is now a significant opportunity to integrate these federated systems with foundation models that can process distributed data streams or coordinate physical processes across heterogeneous environments. Such models could take multiple forms: large language models (LLMs) that augment scientists’ ability to manage complex distributed systems; agent-based frameworks that execute control policies or distributed data processing; or real-time physics simulators that interpret and contextualize sensor data at scale.
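As a concrete illustration of the agent-based form, the sketch below wires a scripted planner (standing in for an LLM) to a small registry of tools that wrap facility operations. The tool names, return values, and plan are all hypothetical.

```python
# Illustrative agent-dispatch sketch: an LLM would normally generate the
# plan; here a scripted list of (tool, args) actions stands in for it.
def read_sensor(args):
    """Stand-in for querying a facility data stream."""
    return {"temperature_K": 301.2}

def adjust_setpoint(args):
    """Stand-in for issuing a control action to an instrument."""
    return {"ack": True, "setpoint_K": args["target_K"]}

TOOLS = {"read_sensor": read_sensor, "adjust_setpoint": adjust_setpoint}

def run_agent(plan):
    """Execute a sequence of (tool, args) actions, logging each result."""
    log = []
    for tool_name, args in plan:
        result = TOOLS[tool_name](args)
        log.append((tool_name, result))
    return log

# In a real agentic framework, this plan would be produced and revised
# by a foundation model conditioned on the accumulating log.
log = run_agent([
    ("read_sensor", {}),
    ("adjust_setpoint", {"target_K": 300.0}),
])
```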
As DOE’s workforce turns over, the challenge of maintaining legacy weapons systems and associated hardware or software tools becomes increasingly burdensome; frequently, a single scientist may hold a disproportionate amount of expertise on a given component or system. As staff transition to retirement or alternative career paths, their hard drives may contain vast swaths of data, simulation configuration files, and source code that would take substantial time and financial investment to reproduce. Simultaneously, as new staff are hired, it is broadly understood that there is a steep learning curve to train on the deeply technical software and modeling frameworks used across the laboratories. Foundation models offer a technique to automate the consolidation of existing knowledge and can be used in a copilot configuration to train new members of the workforce, particularly in legacy programming languages or hardware systems that are rarely taught in contemporary university programs.
Some of the most promising demonstrations of AI-augmented physics simulation have emerged in short-term weather forecasting, where ubiquitous reanalysis data have enabled models that can deliver real-time predictions on a single graphics processing unit, dramatically reducing the computational cost compared to conventional partial differential equation–based solvers at exascale. This creates strategic opportunities to adopt these tools to enhance data-driven decision making and to integrate them into existing physics-based modeling campaigns.
Unique to DOE’s mission is the requirement to fuse weather prediction with additional sensing modalities relevant to national security. For example, nonproliferation and counter-terrorism tasks often rely on combining weather models with satellite imagery and other geospatial data. Early industry examples, such as Microsoft’s real-time weather foundation models, demonstrate that these models can serve as effective multitasking platforms that generalize well to satellite data streams and other remote sensing tasks.
Beyond this immediate application, the prevalence of diverse scientific data across DOE highlights an opportunity to advance a distinctive form of multimodal learning, extending beyond the text, audio, and video focus common in commercial AI. For example, in stockpile stewardship, it is often necessary to fuse heterogeneous material characterization data—such as X-ray diffraction, electron microscopy, user facility measurements, and high-fidelity simulations—with knowledge graphs and other structured sources, including classified information. Developing foundation models capable of reasoning across such multimodal scientific data streams could establish a unique capability aligned with DOE’s national security and scientific missions.
While large industrial AI companies have deep expertise in first-order optimizers, automatic differentiation, and other numerical methods central to machine learning, DOE remains a global leader in advanced scientific computing, including large-scale linear algebra; high-performance numerical solvers; higher-order, structure-preserving, and large-scale constrained optimization libraries; and frameworks for discretizing the partial differential equations that underpin scientific simulation. There is a major opportunity to bridge this substantial investment in foundational scientific software with the next generation of foundation models, whether developed by industry or within DOE itself.
As machine learning was initially applied to scientific problems, there was a reluctance within DOE to compete with TensorFlow or PyTorch. Now that these libraries are relatively mature, open-source libraries such as Trilinos could serve a valuable role in developing lightweight wrapper libraries that facilitate the interfacing of production codes with LLMs. Several notable DOE codes such as MFEM and Albany have begun exposing automatic differentiation and adjoint calculations in a manner that could be accessed by an LLM (MFEM n.d.; Salinger et al. 2016). DOE has invested in higher-level runtime systems that simplify the programming of distributed-memory environments. Frameworks such as Charm++/AMPI (Kale and Krishnan 1993), Legion (Bauer et al. 2012), UPC++ (Bachan et al. 2019), Global Arrays (Nieplocha et al. 1994), and HPX (Kaiser et al. 2014) provide hardware-agnostic abstractions for communication, load balancing, and task scheduling in parallel computing; similar abstractions that schedule agentic actions or simulation queries across large-scale MPI-style environments, whether directed by a foundation model or used to build one, would have comparable value.
A primary function of foundation models is to compress a large data corpus into a latent representation that supports multiple downstream tasks. DOE may play a valuable role in developing open-source software tools that support scientific inference from a pretrained latent space. For example, although machine-learned potentials have been widely successful, their implementation within production molecular dynamics simulators such as LAMMPS is often ad hoc, just-in-time compiled, and suboptimal in performance. There is a need for a universal library that can distill these classes of data-driven computational kernels into performant, potentially Kokkos-accelerated modules that can be readily deployed in production codes. This opportunity extends beyond LAMMPS to any simulator that would extract data-driven models from a central, pretrained foundation model.
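The distillation idea can be illustrated with a deliberately simple surrogate: fit a cheap model to an expensive "teacher" pair potential so that only the cheap model runs inside a production-style inner loop. The teacher and surrogate below are toy stand-ins, not LAMMPS or Kokkos kernels.

```python
# Hedged sketch of kernel distillation: sample an expensive reference
# potential and fit a cheap polynomial surrogate for deployment in a
# simulator's inner loop. All functions here are illustrative toys.
import numpy as np

def teacher_potential(r):
    """Expensive reference model: a Lennard-Jones-like pair energy."""
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)

def distill(teacher, r_min=0.95, r_max=2.5, n_samples=200, degree=8):
    """Sample the teacher and fit a polynomial surrogate by least squares."""
    r = np.linspace(r_min, r_max, n_samples)
    coeffs = np.polyfit(r, teacher(r), degree)
    return np.poly1d(coeffs)

surrogate = distill(teacher_potential)

# Validate the surrogate inside its fitted range before trusting it.
r_test = np.linspace(1.0, 2.4, 50)
max_err = np.max(np.abs(surrogate(r_test) - teacher_potential(r_test)))
```

A real distillation library would replace the polynomial with a compact neural network or tabulated spline, validate against held-out configurations, and emit a module in the simulator's native kernel format; the structure of sample, fit, and verify would remain the same.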
In the past year, agentic AI has surged as a means of using LLMs to launch external agents that explore hypotheses or improve and verify responses. DOE maintains open-source code with an estimated collective value of $407 million per year (Shrivastava and Korkmaz 2024), with the Exascale Computing Project alone representing 70 distinct scientific codebases. There is a unique opportunity for DOE to expose automatic differentiation “hooks” in its open-source libraries to allow LLMs to couple directly to production codes, integrating robust numerical prediction into the training process. This would allow LLMs both to perform simulation and to calculate loss functions, supporting holistic end-to-end training through reliable and mature DOE simulators. Several DOE codes already expose adjoints in this manner (see, e.g., MFEM), so the initial software infrastructure is already in place. In addition to driving simulators in an agentic manner, there is also an opportunity to drive user facilities or autonomous “self-driving” laboratories that generate and process multimodal data. Although multimodal learning is of massive interest to industry, the breadth of DOE’s modalities, spanning simulation (from ab initio density functional theory to exascale Earth system models), experiment (from tabletop X-ray measurements to massive user facilities), and text (in the form of technical reports and classified journals), dwarfs the more focused efforts likely to be conducted by industry.
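A minimal sketch of the adjoint-hook pattern, assuming a toy one-dimensional Poisson solver (the class and method names are hypothetical, not an actual MFEM interface): the solver exposes forward and adjoint solves so that an outer training loop, here plain gradient descent standing in for an LLM-driven workflow, can differentiate a loss through the simulation.

```python
# Illustrative adjoint hook: forward() runs the simulation, adjoint()
# returns the gradient of a loss with respect to a model parameter.
import numpy as np

class PoissonSolver:
    """Solve -u'' = theta * f on a 1D grid with zero boundary values."""
    def __init__(self, n=64):
        h = 1.0 / (n + 1)
        # Standard second-difference stiffness matrix.
        self.A = (np.diag(2.0 * np.ones(n))
                  - np.diag(np.ones(n - 1), 1)
                  - np.diag(np.ones(n - 1), -1)) / h**2
        x = np.linspace(h, 1.0 - h, n)
        self.f = np.sin(np.pi * x)

    def forward(self, theta):
        return np.linalg.solve(self.A, theta * self.f)

    def adjoint(self, theta, dloss_du):
        # Adjoint solve: A^T lam = dL/du, then dL/dtheta = lam . f
        lam = np.linalg.solve(self.A.T, dloss_du)
        return lam @ self.f

solver = PoissonSolver()
u_obs = solver.forward(3.0)          # synthetic "measurements"

# Recover the parameter by descending the squared-misfit loss,
# using only the gradients the solver's adjoint hook provides.
theta = 0.5
for _ in range(200):
    u = solver.forward(theta)
    grad = solver.adjoint(theta, 2.0 * (u - u_obs))
    theta -= 0.5 * grad              # step size tuned for this toy problem
```

In the agentic setting described above, the loss gradient would flow onward into a learned model's parameters rather than stopping at a single scalar, but the interface contract, a forward solve paired with an adjoint solve, is the same.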
Conclusion 4-2: DOE is uniquely positioned to shape the future of AI-driven science. Material informatics and near-autonomous scientific platforms highlight the power of combining curated experimental data, simulation, and advanced AI to accelerate discovery. Federated computing and facility integration extend this vision by enabling distributed use of DOE’s infrastructure.
The curation and integration of specialized knowledge coupled with emerging multimodal and agentic AI approaches underscore the importance of preserving expertise, reasoning across diverse scientific data streams, and directly linking foundation models to DOE’s mature simulation ecosystem.
Recommendation 4-2: The Department of Energy should both modernize existing infrastructure and invest in new infrastructure to generate, curate, and serve the large data corpus necessary to build a scientific foundation model, including simulations to create data, high-throughput and/or autonomous experimental facilities, and facilities to host data. Additionally, it should create interfaces (e.g., agentic tools, retrieval-augmented generation tools) through which large foundation models may easily access these sources. A successful strategy will provide holistic access to multimodal or heterogeneous infrastructure across the entire DOE complex, mitigating the “stove-piping” of assets between different laboratories or departments.
The success of any DOE-wide foundation model initiative depends entirely on attracting and retaining top AI talent. This presents significant challenges, primarily due to intense competition from the private sector. Industry has rapidly accelerated its AI hiring, evidenced by a 21 percent increase in AI-related job postings from 2018 to 2023. Critically, employers are now prioritizing practical skill-based hiring over formal degrees. With AI competencies commanding a 23 percent wage premium—a value surpassing that of degrees up to the doctoral level (Bone et al. 2025)—and industry offering higher compensation and exceptional working conditions, DOE will need to compete for this essential expertise.
An added challenge that DOE faces arises from slow funding cycles that make it difficult to keep up with the pace of innovation in industry. Traditional DOE funding cycles, often spanning multiple years, can impede the rapid development and deployment of AI technologies. In contrast, industry laboratories frequently operate with more agile funding mechanisms, enabling quicker adaptation to emerging AI advancements. Within the National Laboratories, laboratory-directed research and development (LDRD)-based funding leads to a minimum
1-year lag before a project can start, which could be slow to the point of missing a major development entirely. Furthermore, industry often has the resources to allow teams to focus solely on a single large-scale project, often for periods on the order of years. To bridge this gap, DOE could consider implementing more flexible funding models, such as rolling proposals or seed grants, to accelerate AI research and development. DOE’s Office of Science maintains a number of large multi-institutional initiatives that may provide a vehicle to adapt more flexibly, for example, a Scientific Discovery Through Advanced Computing center, which has a broad enough scope and a sufficiently long time horizon to adapt to rapid developments in the field while maintaining accountability to taxpayers.
To foster an environment conducive to AI innovation, DOE needs to cultivate a research culture that emphasizes flexibility and speed. This includes adopting performance metrics that prioritize real-world impacts, such as model robustness and deployment success, over traditional academic outputs such as publications. Encouraging interdisciplinary collaboration and providing recognition for contributions to AI systems and infrastructure can further enhance DOE’s competitiveness in the AI research landscape.
Despite these challenges, DOE possesses unique strengths that can be leveraged to advance AI research and attract talent. DOE engages in mission-driven research; its focus on societal challenges, such as clean energy and national security, attracts scientists motivated by purpose-driven work. Furthermore, in contrast to industry, long-term career tracks within DOE foster the sustained development of complex AI systems integrated with physical models. Finally, collaborations among physicists, chemists, computer scientists, and engineers enable the development of AI models that require domain-aware reasoning.
DOE’s infrastructure and expertise provide a solid foundation for AI-driven scientific discovery. Decades of investment in physics-based simulation codes offer valuable assets that AI can learn from or emulate. Robust, scalable software platforms developed by DOE laboratories can power hybrid workflows combining symbolic and neural reasoning. Scientific data sets from large-scale experiments serve as high-value training and validation sources for domain-specific AI. Furthermore, DOE’s supercomputers and user facilities provide superior computing capabilities and experimental data for training foundation models and deploying AI-augmented simulations.
A further issue is how DOE can best collaborate with universities. Building a strong academic pipeline is crucial for long-term AI capability in DOE, and several avenues exist for encouraging further collaboration with universities.
Conclusion 4-3: DOE struggles to compete with the private sector for AI talent due to lower salaries and slow, traditional funding cycles. However, DOE’s unique strengths, such as its mission-driven work, long-term career paths, and powerful supercomputing infrastructure, can be leveraged to attract talent. Building a strong academic pipeline through closer collaboration with universities is also essential for its long-term success.
Recommendation 4-3: To maintain a top-tier workforce, the Department of Energy (DOE) should design leadership-scale scientific research programs and provide staff with opportunities to rapidly adapt to a quickly evolving technological landscape. To attract early-career scientists, DOE should be perceived as the best place to become a leader in scientific machine learning; while industry may lead the large language model space, DOE’s unique access to state-of-the-art science can attract top talent. To be competitive with large-scale development efforts in industry, it is important to avoid fracturing scientists’ time and attention. DOE should create mechanisms by which medium to large teams can mount coordinated, focused efforts targeting mission-critical developments in fundamental research into, and applications of, foundation models for science.
DOE provides several open-source data repositories that serve the research community. These repositories are organized in a fragmented fashion across DOE subdomains (Table 4-1), each hosting heterogeneous data formats and sizes without a unified access interface. Many smaller data sets—often the output of single-investigator LDRD projects—reside on external curation platforms, further fragmenting access. Automated classifiers must inspect each data set for export-control restrictions, adding another layer of procedural complexity. Collected data sets typically represent final project outputs and omit intermediate simulations, classified results, and the metadata and documentation generated during data production.
TABLE 4-1 Department of Energy (DOE) Open-Source Data Repositories
| Name | Description |
|---|---|
| Open Data Catalog | Machine-readable list of all publicly available data sets maintained by DOE and its program and staff offices (https://www.energy.gov/data/articles/open-data-catalog). |
| DOE Data Explorer (OSTI/E-Link) | Portal for DOE-funded science and engineering data (https://www.osti.gov/dataexplorer). |
| Materials Data Facility | Publication and discovery service for materials data (Blaiszik et al. 2016; NETL 2024). |
| Earth System Grid Federation | Archive of climate model output and observations (Ananthakrishnan et al. 2007). |
| Joint Genome Institute Data Portal | Genomic and metagenomic data sets for bioenergy research (https://data.jgi.doe.gov). |
| Open Energy Information (OpenEI) | Wiki and repository of energy, resource, and policy data (https://openei.org/wiki/Main_Page). |
| Wind Integration National Dataset Toolkit | High-resolution wind power meteorology and output data (Draxl et al. 2015). |
| NREL Data Catalog | Photovoltaic system performance data (https://openei.org/wiki/PVDAQ). |
NOTE: NETL = National Energy Technology Laboratory; NREL = National Renewable Energy Laboratory.
DOE can address these challenges by establishing a centralized data center on the scale of its flagship supercomputing facilities. Such a center would offer extensive storage infrastructure, dedicated curation staff, and clear governance policies enforcing a consistent application programming interface for hosting and retrieving multimodal scientific data sets. A centralized data center could also provide interfaces not only to the data but also to prospective foundation models; easy access to these models could be crucial to the scientific discovery cycle. It would also support research into best practices for data curation and the development of software tools tailored to ingesting large data sets into foundation models.
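A unified interface of the kind described above might, in its simplest form, expose a single search entry point over catalogs that retain their own internal formats; a foundation model would call this as a retrieval tool in a retrieval-augmented generation workflow. The repository names and records below are illustrative placeholders, and the keyword-overlap ranking is a stand-in for real retrieval.

```python
# Illustrative sketch of a unified retrieval interface over fragmented
# repositories. Records and repository listings are hypothetical.
from dataclasses import dataclass

@dataclass
class Record:
    repo: str
    title: str
    keywords: frozenset

class UnifiedCatalog:
    def __init__(self):
        self.records = []

    def register(self, repo, entries):
        """Ingest one repository's listings into the shared index."""
        for title, kws in entries:
            self.records.append(Record(repo, title, frozenset(kws)))

    def search(self, query_terms, top_k=3):
        """Rank records by keyword overlap with the query terms."""
        terms = frozenset(query_terms)
        scored = sorted(self.records,
                        key=lambda r: len(r.keywords & terms),
                        reverse=True)
        return [(r.repo, r.title) for r in scored[:top_k]
                if r.keywords & terms]

catalog = UnifiedCatalog()
catalog.register("Materials Data Facility",
                 [("alloy tensile tests", ["alloy", "mechanical"])])
catalog.register("ESGF",
                 [("CMIP6 surface temperature", ["climate", "temperature"])])
hits = catalog.search(["alloy"])
```

A production system would substitute embedding-based retrieval, export-control screening, and per-repository adapters behind the same single entry point; the design choice being illustrated is the uniform interface, not the ranking method.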
Conclusion 4-4: Although DOE curates many data sets of high value for the construction of foundation models, they are typically developed in an ad hoc manner, with heterogeneous file formats and data curation strategies that currently pose a barrier to high-throughput processing. Foundation models present a unique opportunity to address this issue.
Recommendation 4-4: To increase the success of future foundation models for science, the Department of Energy should invest in large-scale data user facilities (classified and unclassified), leveraging artificial intelligence’s growing capability to interpret heterogeneous scientific data, similar to the successes experienced with previous investments in supercomputers and open-source scientific computing libraries.
Ananthakrishnan, R., D.E. Bernholdt, S. Bharathi, D. Brown, M. Chen, A.L. Chervenak, L. Cinquini, et al. 2007. “Building a Global Federation System for Climate Change Research: The Earth System Grid Center for Enabling Technologies (ESG-CET).” Journal of Physics: Conference Series 78(1). https://doi.org/10.1088/1742-6596/78/1/012050.
Bachan, J., S. Baden, D. Bonachea, P. Hargrove, S. Hofmeyr, M. Jacquelin, A. Kamil, and B.v. Straalen. 2019. UPC++ Programmer’s Guide, v1.0-2019.3.0. Tech Report LBNL 2001191. Lawrence Berkeley National Laboratory.
Baker, N. 2024. “Unlocking a New Era for Scientific Discovery with AI: How Microsoft’s AI Screened Over 32 Million Candidates to Find a Better Battery.” Microsoft Azure Blog. https://azure.microsoft.com/en-us/blog/quantum/2024/01/09/unlocking-a-new-era-for-scientific-discovery-with-ai-how-microsofts-ai-screened-over-32-million-candidates-to-find-a-better-battery/?msockid=04adc0effe39670f1736d575ff946616.
Bauer, M., S. Treichler, E. Slaughter, and A. Aiken. 2012. “Legion: Expressing Locality and Independence with Logical Regions.” In SC ‘12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. IEEE. https://doi.org/10.1109/SC.2012.71.
Blaiszik, B., K. Chard, J. Pruyne, R. Ananthakrishnan, S. Tuecke, and I. Foster. 2016. “The Materials Data Facility: Data Services to Advance Materials Science Research.” JOM 68(8):2045–2052.
Bone, M., E. González Ehlinger, and F. Stephany. 2025. “Skills or Degree? The Rise of Skill-Based Hiring for AI and Green Jobs.” Technological Forecasting and Social Change 214:124042.
Dingreville, R., N. Trask, B.L. Boyce, and G.E. Karniadakis. 2024. “Unlocking Alternative Solutions for Critical Materials via Materials Informatics.” The Bridge 55(2):46–54.
Draxl, C., B.M. Hodge, A. Clifton, and J. McCaa. 2015. Overview and Meteorological Validation of the Wind Integration National Dataset Toolkit. Technical Report NREL/TP-5000-61740. National Renewable Energy Laboratory.
EoP (Executive Office of the President of the United States). 2022. “National Strategy for Advanced Manufacturing.” https://www.energy.gov/sites/default/files/2024-03/National-Strategy-for-Advanced-Manufacturing-10072022.pdf.
Frey, N.C., I. Hötzel, S.D. Stanton, R. Kelly, R.G. Alberstein, E. Makowski, K. Martinkus, et al. 2025. “Lab-in-the-Loop Therapeutic Antibody Design with Deep Learning.” bioRxiv: 2025.2002.2019.639050.
Kaiser, H., A. Serio, T. Heller, D. Fey, and B. Adelstein-Lelbach. 2014. “HPX—A Task Based Programming Model in a Global Address Space.” In PGAS ‘14: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models. https://doi.org/10.1145/2676870.2676883.
Kale, L.V., and S. Krishnan. 1993. “CHARM++: A Portable Concurrent Object Oriented System Based on C++.” ACM SIGPLAN Notices 28(10):91–108.
MFEM. n.d. “Automatic Differentiation Mini Applications.” https://mfem.org/autodiff, accessed September 29, 2025.
NETL (National Energy Technology Laboratory). 2024. “Critical Minerals and Materials Program.” Program 141. https://netl.doe.gov/sites/default/files/2024-10/Program-141.pdf.
Nieplocha, J., R.J. Harrison, and R.J. Littlefield. 1994. “Global Arrays: A Portable ‘Shared-Memory’ Programming Model for Distributed Memory Computers.” In Supercomputing ‘94: Proceedings of the 1994 ACM/IEEE Conference on Supercomputing, pp. 340–349. IEEE.
Salinger, A.G., R.A. Bartlett, A.M. Bradley, Q. Chen, I.P. Demeshko, X. Gao, G.A. Hansen, et al. 2016. “Albany: Using Component-Based Design to Develop a Flexible, Generic Multiphysics Analysis Code.” International Journal for Multiscale Computational Engineering 14(4):415–438.
Shrivastava, R., and G. Korkmaz. 2024. “Measuring Public Open-Source Software in the Federal Government: An Analysis of Code.” Journal of Data Science 22(3):356–375.