The landscape of artificial intelligence (AI) is undergoing a significant transformation driven by the emergence and evolution of foundation models. These models, typically large-scale neural networks trained on vast quantities of heterogeneous data, represent a departure from traditional AI systems designed for specific tasks. Foundation models can generate findings and discern patterns within data sets whose volumes exceed, by orders of magnitude, what traditional solvers and even earlier machine learning models can feasibly process and store.
Key characteristics defining foundation models include the following:
These characteristics position foundation models as a potential paradigm shift1 for scientific research with a concomitant impact on the Department of Energy’s (DOE’s) mission.
In evaluating the roles of foundation models for scientific discovery, a natural early question is whether they present stand-alone alternatives to the modeling approaches that preceded them. We examine this perspective in the current section.
The key strengths of foundation models lie in adaptability, generalizability, scalability, and their capacity for multimodal integration. Foundation models can seamlessly combine multiple data modalities—including numerical simulation outputs, experimental sensor data, textual documentation, images, and videos—into unified representational frameworks. This unique capability makes them particularly well suited for fields such as life sciences, materials science, fluid dynamics, weather forecasting, and energy systems, where data complexity and heterogeneity pose significant challenges to traditional methods.
The scalability of foundation models, supported by large-scale computational resources, allows them to uncover complex patterns and interactions within massive data sets. This results in accelerated discovery and improved predictive performance in multifaceted scientific scenarios (Bodnar et al. 2025). Their generalized learning mechanisms further enable deployment across diverse operational contexts without requiring extensive manual reprogramming. In environments such as DOE facilities, this adaptability can lead to more dynamic and responsive control systems, enhancing operational efficiency and resilience in the face of evolving conditions.
Within DOE’s computational science program, foundation models bring two particularly valuable advantages. The first centers on spatiotemporal foundation models—transformer-based architectures pretrained on large data sets derived from high-fidelity simulations of multiphysics systems described by partial differential equations. These models can forecast spatiotemporal solutions, aligning with one of the core goals of scientific machine learning: extending the capabilities of traditional, computationally expensive, discretization-based solvers. Spatiotemporal foundation models offer dramatic reductions in computational cost, enabling large-scale or long-time simulations at up to five orders of magnitude less computational effort. This has been compellingly demonstrated in domains such as Earth system modeling (Bodnar et al. 2025).
___________________
1 A paradigm shift in this context means a fundamental change in how scientific research is conducted, driven by the introduction of foundation models.
Although spatiotemporal foundation models may not yet achieve the accuracy of equation-based solvers, they offer a compelling trade-off through fine-tuning. When pretrained on a broad range of physics, these models can be adapted to new physical systems not present in the original training data. Remarkably, this transfer learning often yields better results than training the model from scratch on a single, narrow domain. Thus, spatiotemporal foundation models not only function as efficient solvers but also provide a scalable framework for generalizing across physical phenomena—an invaluable capability in a wide-ranging computational science program. Examples include multiphysics pretraining (McCabe et al. 2023) and co-domain neural operators (Rahman et al. 2024).
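The transfer-learning behavior described above can be sketched in plain Python (NumPy only). The network, tasks, and hyperparameters below are illustrative inventions, not any published model: a small network is "pretrained" on one-step dynamics of one system and then fine-tuned on a handful of samples from a related system, reducing its error on the new task.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden, n_out):
    # Small random weights: a stand-in for a pretrained network's architecture.
    return {"W1": 0.5 * rng.standard_normal((n_hidden, n_in)), "b1": np.zeros(n_hidden),
            "W2": 0.5 * rng.standard_normal((n_out, n_hidden)), "b2": np.zeros(n_out)}

def forward(p, x):
    h = np.tanh(p["W1"] @ x.T + p["b1"][:, None])
    return (p["W2"] @ h + p["b2"][:, None]).T

def mse(p, x, y):
    return float(np.mean((forward(p, x) - y) ** 2))

def train(p, x, y, steps=500, lr=0.05):
    # Plain full-batch gradient descent on mean-squared error.
    for _ in range(steps):
        h = np.tanh(p["W1"] @ x.T + p["b1"][:, None])
        out = p["W2"] @ h + p["b2"][:, None]
        err = 2.0 * (out - y.T) / y.size          # dL/dout
        gW2, gb2 = err @ h.T, err.sum(axis=1)
        back = (p["W2"].T @ err) * (1.0 - h ** 2)  # backprop through tanh
        gW1, gb1 = back @ x, back.sum(axis=1)
        p["W1"] -= lr * gW1; p["b1"] -= lr * gb1
        p["W2"] -= lr * gW2; p["b2"] -= lr * gb2
    return p

# "Pretraining" task: one-step dynamics u -> 0.9 * shift(u), many samples.
x_pre = rng.standard_normal((256, 8))
y_pre = 0.9 * np.roll(x_pre, 1, axis=1)
model = train(init_mlp(8, 32, 8), x_pre, y_pre)

# New, related system (different damping and shift), with only a few samples.
x_new = rng.standard_normal((16, 8))
y_new = 0.7 * np.roll(x_new, 2, axis=1)

loss_before = mse(model, x_new, y_new)
model = train(model, x_new, y_new, steps=300, lr=0.02)  # fine-tune
loss_after = mse(model, x_new, y_new)
print(loss_before, loss_after)
```

The sketch compresses the idea, not the scale: real spatiotemporal foundation models replace this toy network with transformer architectures and the toy dynamics with high-fidelity simulation data.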
A second and perhaps even more intriguing potential lies in the inference of emergent physics. Because of the underlying transformer architecture—specifically the use of attention mechanisms and the capacity to learn contextual relationships over long pretraining epochs—these models may begin to reveal new physical findings or discoveries. They could go beyond simply generating solutions to explain the emergence of features in space and time, such as why vortex structures emerge in certain regions of a flow at specific times, or how macroscale material failure occurs as a consequence of microcrack and dislocation interactions. Such tasks are central to the roles of computational physicists. This possibility becomes even more plausible when spatiotemporal foundation models are integrated with large language models (LLMs) into multimodal systems (Ashman et al. 2024). Such combinations may bridge the gap between predictive modeling and interpretive reasoning, bringing us closer to models that not only solve complex physical systems but also explain them.
In the context of DOE applications, traditional models refer to large-scale computational science solvers as well as statistical models. The solvers include finite element, finite difference, finite volume, and spectral methods and related numerical techniques. Over decades, partnerships between DOE and computational science researchers at U.S. universities have fostered the development of a robust ecosystem of discretization-based solvers. Supported by DOE’s Advanced Scientific Computing Research program and the National Nuclear Security Administration’s Advanced Simulation and Computing program, this effort has produced vast suites of high-performance scientific software, much of it pioneered within DOE laboratories. Examples include the Trilinos Project, Dakota, and MFEM (Adams et al. 2020; Anderson et al. 2021; Heroux et al. 2005).
This ecosystem enables the modeling and simulation of a wide range of multiphysics problems relevant to DOE missions. It has evolved to support computations at the exascale and beyond, laying a firm foundation for applying computational science to complex, large-scale problems in physics, energy, Earth systems, and national security. As machine learning and artificial intelligence have grown in prominence, DOE-supported computational frameworks have begun to incorporate these data-driven methods, enriching traditional modeling approaches without discarding them.
A core strength of discretization-based solvers is their ability to deliver high-fidelity solutions that accurately represent the underlying physics—bounded mainly by the numerical algorithms and available computing power. These solvers explicitly encode conservation laws (e.g., energy, mass, momentum), thermodynamic consistency, and convergence properties, ensuring that model predictions are transparent, interpretable, and physically grounded. Such fidelity, however, comes at a cost: these models often demand significant computational resources, especially for large spatial domains or long-time horizons.
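As an illustration of the explicitly encoded conservation properties described above, the following sketch (with arbitrary, invented grid sizes and parameters) advances the one-dimensional heat equation with a conservative finite-difference stencil on a periodic domain; the discrete fluxes cancel exactly, so the total "mass" is preserved to machine precision.

```python
import numpy as np

def heat_step(u, dt, dx, kappa):
    # Conservative finite-difference update for u_t = kappa * u_xx with
    # periodic boundaries: neighboring fluxes cancel, so sum(u)*dx is preserved.
    lap = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2
    return u + dt * kappa * lap

n, kappa = 64, 0.01
dx = 1.0 / n
dt = 0.4 * dx**2 / kappa            # respects the explicit stability limit dt <= dx^2 / (2*kappa)
x = np.arange(n) * dx
u = np.exp(-100.0 * (x - 0.5) ** 2)  # initial Gaussian pulse
mass0 = u.sum() * dx
for _ in range(500):
    u = heat_step(u, dt, dx, kappa)
print(abs(u.sum() * dx - mass0))     # conservation error near machine precision
```

The same structural guarantees (conservation, stability limits, convergence under refinement) carry over to the production-scale finite element and finite volume codes discussed above; purely data-driven surrogates must instead learn or be constrained toward such properties.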
Despite the emergence of foundation models, traditional physics-based models retain critical advantages, particularly in interpretability, reliability, and strict adherence to physical laws. They are accompanied by rigorous verification, validation, and uncertainty quantification frameworks essential for DOE’s high-stakes applications—such as nuclear reactor safety, weapons stewardship, and other national security tasks. These frameworks ensure compliance with safety, regulatory, and quality standards, which remain challenging for purely data-driven foundation models to satisfy. Furthermore, foundation models have yet to demonstrate generalizability across geometries, initial and boundary conditions, and transitions such as phase changes, laminar-to-turbulent flow, shock formation, and material failure, capabilities that are standard for advanced discretization-based solvers.
In addition to being more amenable to interpretation and to the quantification of their uncertainty, traditional models often require less computational overhead for model development and deployment compared to the extensive pretraining and fine-tuning phases of foundation models. (However, geometry and mesh generation can prove time-consuming, and the expense of large direct numerical simulations is a well-recognized limitation.) Furthermore, traditional models play a foundational role in the data ecosystem—they are often required to generate the high-quality data used to train, fine-tune, or validate foundation models.
Another powerful advantage of traditional approaches lies in their ability to be integrated into statistical modeling frameworks. In many settings, physics-based models, of moderate or lower fidelity, can be embedded within Bayesian hierarchical structures to facilitate efficient uncertainty quantification.
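A minimal sketch of this embedding, using an invented exponential-decay "physics model" with a single rate parameter: synthetic observations are combined with a uniform prior on a parameter grid to yield a posterior mean and standard deviation, that is, a calibrated prediction with quantified uncertainty. A hierarchical treatment would add further levels (e.g., priors on the noise scale), which are omitted here for brevity.

```python
import numpy as np

# A low-fidelity "physics model": exponential decay y(t) = exp(-k * t).
def model(k, t):
    return np.exp(-k * t)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0, 20)
k_true, sigma = 0.8, 0.05
y_obs = model(k_true, t) + sigma * rng.standard_normal(t.size)  # synthetic data

# Grid posterior: uniform prior on k over [0, 2], Gaussian likelihood.
k_grid = np.linspace(0.0, 2.0, 2001)
log_lik = np.array([-0.5 * np.sum((y_obs - model(k, t)) ** 2) / sigma**2
                    for k in k_grid])
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

k_mean = float(np.sum(k_grid * post))                          # posterior mean
k_sd = float(np.sqrt(np.sum((k_grid - k_mean) ** 2 * post)))   # posterior spread
print(k_mean, k_sd)
```

In practice the grid evaluation is replaced by Markov chain Monte Carlo or variational inference, and the cheap decay model by a moderate-fidelity simulator, but the structure, a physics model inside a probabilistic wrapper, is the same.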
Integrating traditional modeling approaches with foundation models offers transformative potential for DOE’s scientific enterprise. Traditional computational methods—such as finite element, finite volume, and spectral solvers—have formed the bedrock of high-fidelity simulations, enabling predictive science across complex domains such as materials physics, turbulent fluid flow, Earth systems modeling, and nuclear systems, as outlined above. These models are grounded in well-understood physical laws and verification and/or validation protocols, making them indispensable for safety-critical and regulatory-constrained applications. However, they come with significant computational demands, particularly for large-scale or long-time simulations.
The committee reiterates that, by contrast, foundation models trained on vast multimodal data sets—including simulation results, sensor data, imagery, and scientific literature—offer scalability, generalizability, and data-driven adaptability. Rather than viewing foundation models as replacements for traditional methods in computational science, the committee advocates for a synergistic integration of the two (Koumoutsakos 2024). Hybrid modeling strategies can fuse the physical interpretability and robustness of classical solvers with the efficiency and learning capabilities of foundation models, particularly in multiscale, multiphysics applications where stand-alone approaches often fall short.
Foundation models can significantly enhance the entire research life cycle at DOE national laboratories and user facilities through multiple avenues of hybridization:
Foundation model development is progressing toward augmenting traditional simulations by learning data-driven corrections to reduced-order models. For example, foundation model–based closure approximations in turbulence and combustion science could improve fidelity, while in nuclear and Earth systems modeling, they could enhance accuracy and enable rigorous uncertainty quantification.
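The closure idea can be illustrated with a deliberately simple invented example: a reduced-order model that drops the quadratic term of logistic growth, plus a correction fitted by least squares to full-model data, standing in for a learned, foundation model–based closure.

```python
import numpy as np

r, dt = 1.0, 0.01

def truth_step(y):
    # Full model: logistic growth y' = r * y * (1 - y), forward Euler.
    return y + dt * r * y * (1.0 - y)

def reduced_step(y, closure=lambda y: 0.0):
    # Reduced-order model keeps only linear growth; the closure term
    # stands in for the dropped y**2 interaction.
    return y + dt * (r * y + closure(y))

# Training pairs (state, missing tendency) generated from the full model.
ys = np.linspace(0.05, 0.95, 50)
missing = (truth_step(ys) - reduced_step(ys)) / dt   # equals -r * y**2

# Fit a quadratic closure by least squares (a toy stand-in for a learned correction).
coeffs = np.polyfit(ys, missing, deg=2)
closure = lambda y: np.polyval(coeffs, y)

# Roll out truth, the bare reduced model, and the corrected reduced model.
y_t = y_r = y_c = 0.1
for _ in range(300):
    y_t = truth_step(y_t)
    y_r = reduced_step(y_r)
    y_c = reduced_step(y_c, closure)
print(abs(y_c - y_t), abs(y_r - y_t))  # corrected model tracks the truth; bare model drifts
```

Turbulence or combustion closures replace this one-dimensional polynomial fit with high-dimensional learned operators trained on resolved simulations, but the division of labor is the same: the solver carries the known physics, and the learned term supplies what the reduction discarded.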
The fusion of foundation models with traditional numerical methods represents more than a computational advance: it constitutes a paradigm shift in how scientific discovery is conducted. By combining rigorous physical modeling with the adaptive learning capabilities of modern AI, this hybrid approach opens the door to faster, more accurate, and more autonomous science.
From accelerating simulations to enabling real-time experimental feedback and automating hypothesis generation, the integration of foundation models into DOE’s computational and experimental ecosystem promises to reshape the pace and scope of scientific innovation (Bodnar et al. 2025; Herde et al. 2024; McCabe et al. 2023; Nguyen et al. 2023; Ye et al. 2024, 2025).
Conclusion 2-1: Integrating traditional models with foundation models is proving to be increasingly powerful and has significant potential to advance computational findings in the physical sciences. These hybrid methods leverage the physical interpretability and structures of classical computational approaches alongside the data-driven adaptability of foundation models. This integration enables the modeling of complex multiphysics, multiscale, and partially observed or incompletely understood systems that challenge traditional approaches both computationally and mathematically.
Recommendation 2-1: The Department of Energy (DOE) should invest in foundation model development, particularly in areas of strategic importance where DOE can leverage its unique strengths. DOE should also prioritize the hybridization of foundation models and traditional modeling. Such hybrid modeling strategies can fuse the physical interpretability and robustness of classical solvers with the efficiency and learning capabilities of foundation models, particularly in multiscale, multiphysics applications where traditional approaches have limitations in capturing the heterogeneity, complexity, and dynamics of the physical system. DOE should not, however, abandon its expertise in numerical and computational methods and should continue investing strategically in software and infrastructure.
Adams, B.M., W.J. Bohnhoff, K.R. Dalbey, M.S. Ebeida, J.P. Eddy, M.S. Eldred, R.W. Hooper, et al. 2020. Dakota, a Multilevel Parallel Object-Oriented Framework for Design Optimization, Parameter Estimation, Uncertainty Quantification, and Sensitivity Analysis: Version 6.13 User’s Manual. Sandia National Laboratories. https://www.sandia.gov/app/uploads/sites/241/2023/03/Users-6.13.0.pdf.
Anderson, R., J. Andrej, A. Barker, J. Bramwell, J.S. Camier, J. Cerveny, V. Dobrev, et al. 2021. “MFEM: A Modular Finite Element Methods Library.” Computers and Mathematics with Applications 81:42–74.
Ashman, M., C. Diaconu, E. Langezaal, A. Weller, and R.E. Turner. 2024. “Gridded Transformer Neural Processes for Large Unstructured Spatio-Temporal Data.” arXiv:2410.06731, https://ui.adsabs.harvard.edu/abs/2024arXiv241006731A, accessed October 1, 2024.
Bodnar, C., W.P. Bruinsma, A. Lucic, M. Stanley, A. Allen, J. Brandstetter, P. Garvan, M. Riechert, J.A. Weyn, H. Dong, J.K. Gupta, K. Thambiratnam, A.T. Archibald, C.C. Wu, E. Heider, M. Welling, R.E. Turner, and P. Perdikaris. 2025. “A Foundation Model for the Earth System.” Nature 641(8065):1180–1187.
Brodnik, N.R., C. Muir, N. Tulshibagwale, J. Rossin, M.P. Echlin, C.M. Hamel, S.L.B. Kramer, T.M. Pollock, J.D. Kiser, C. Smith, and S.H. Daly. 2023. “Perspective: Machine Learning in Experimental Solid Mechanics.” Journal of the Mechanics and Physics of Solids 173:105231. https://doi.org/10.1016/j.jmps.2023.105231.
DARPA (Defense Advanced Research Projects Agency). 2025. “DIAL: Mathematics for the Discovery of Algorithms and Architectures.” https://www.darpa.mil/research/programs/mathematics-for-the-discovery-of-algorithms-and-architectures, accessed July 31, 2025.
Gottweis, J., W.-H. Weng, A. Daryin, T. Tu, A. Palepu, P. Sirkovic, A. Myaskovsky, et al. 2025. “Towards an AI Co-Scientist.” arXiv:2502.18864. https://ui.adsabs.harvard.edu/abs/2025arXiv250218864G.
Herde, M., B. Raonić, T. Rohner, R. Käppeli, R. Molinaro, E. de Bézenac, and S. Mishra. 2024. “Poseidon: Efficient Foundation Models for PDEs.” arXiv:2405.19101. https://ui.adsabs.harvard.edu/abs/2024arXiv240519101H.
Heroux, M.A., R.A. Bartlett, V.E. Howle, R.J. Hoekstra, J.J. Hu, T.G. Kolda, R.B. Lehoucq, et al. 2005. “An Overview of the Trilinos Project.” ACM Transactions on Mathematical Software 31(3):397–423.
Koumoutsakos, P. 2024. “On Roads Less Travelled Between AI and Computational Science.” Nature Reviews Physics 6(6):342–344.
McCabe, M., B. Régaldo-Saint Blancard, L. Holden Parker, R. Ohana, M. Cranmer, A. Bietti, M. Eickenberg, et al. 2023. “Multiple Physics Pretraining for Physical Surrogate Models.” arXiv:2310.02994. https://ui.adsabs.harvard.edu/abs/2023arXiv231002994M (last revised December 10, 2024).
Nguyen, T., J. Brandstetter, A. Kapoor, J.K. Gupta, and A. Grover. 2023. “ClimaX: A Foundation Model for Weather and Climate.” arXiv:2301.10343. https://ui.adsabs.harvard.edu/abs/2023arXiv230110343N.
Rahman, M.A., R.J. George, M. Elleithy, D. Leibovici, Z. Li, B. Bonev, C. White, et al. 2024. “Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs.” arXiv:2403.12553. https://doi.org/10.48550/arXiv.2403.12553.
Sakana.AI. 2024. “The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery.” https://sakana.ai/ai-scientist.
Skarlinski, M., T. Nadolski, J. Braza, R. Storni, M. Caldas, L. Mitchener, M. Hinks, A. White, and S. Rodriques. 2025. “FutureHouse Platform: Superintelligent AI Agents for Scientific Discovery.” FutureHouse. https://www.futurehouse.org/research-announcements/launching-futurehouse-platform-ai-agents.
Ye, Z., X. Huang, L. Chen, H. Liu, Z. Wang, and B. Dong. 2024. “PDEformer: Towards a Foundation Model for One-Dimensional Partial Differential Equations.” arXiv:2402.12652. https://ui.adsabs.harvard.edu/abs/2024arXiv240212652Y.
Ye, Z., Z. Liu, B. Wu, H. Jiang, L. Chen, M. Zhang, X. Huang, et al. 2025. “PDEformer-2: A Versatile Foundation Model for Two-Dimensional Partial Differential Equations.” arXiv:2507.15409. https://ui.adsabs.harvard.edu/abs/2025arXiv250715409Y.