Member agencies of the IC collect, maintain, and store large amounts of data. Today’s profusion of sensors and communications options has led to a dramatic escalation in the amount of data readily available for a wide variety of purposes. Much more data are being created than are being deleted. But a large fraction, as much as 30 to 35 percent of enterprise data, is “frozen” data: data that are rarely accessed (with weeks to years elapsing between uses) and may sit untouched for long periods.6 This poses a challenge because, worldwide, many exabytes (EB) of data must be securely maintained, managed, and administered. Additionally, the amount of data that will need to be archived for short- and long-term use will continue to rise. In particular, the IC has the potential to be one of the largest customers for cold data storage because of its wide-ranging need for information.
Hard disk drives (HDDs), NAND-based flash solid-state drives (SSDs), and magnetic tape are today’s suitable candidate solutions for this ever-growing problem, but massive archives of data will change existing capacity and capability boundaries. Existing technologies such as magnetoresistive random-access memory (MRAM) and optical storage may also be appropriate for some IC use cases.
Emerging data storage technologies are also being researched, such as storage in ceramics, storage in silica, and storage in DNA or other biomolecules. Significantly more research and development (R&D) is needed before these technologies scale up and are considered proven. While progress is being made in these classes of technology, large-scale application is not likely in this decade.
In the near future, IC member agencies expect to maintain data at a scale comparable to that of a large corporation such as Meta or Amazon. Most of these data will be in cold storage for a significant period of time, on the order of typical declassification timescales of 25–50 years. For the purpose of this discussion, archival data storage is defined as the storage of data for these long timescales required by the IC.
To help IC agencies plan program budgets and allocate resources, ODNI requested that the ICSB of the National Academies conduct a REC on technologies for archival data storage. Box 1 shows the statement of task for this REC.
This REC focuses primarily on current data storage technologies, particularly magnetic storage, flash, MRAM, and optical storage. While not all of these technologies are well suited to archival storage applications, the IC sponsor indicated a desire to hear about all possible options. This REC does not address how data storage may change due to quantum computing approaches, as such changes will likely not occur until well after the sponsor’s 2030 timeframe. Brief overviews of emerging technologies are included. Finally, the authors offer comments on the importance of non-technological aspects of archiving, including policies and personnel, which should be considered in the design and acquisition of long-term data storage systems.
__________________
6 J. Monroe, 2023, “Ministering Our Dataverse: The Need for New Technologies,” presentation to the Intelligence Community Studies Board, September 18, Virtual, National Academies of Sciences, Engineering, and Medicine.
BOX 1 Statement of Task

A small set of experts will provide a rapid expert consultation (REC) about technologies likely to be available in the period of 2026 to 2030 that can be used for massive archival data storage. The consultation will produce an individually authored perspective, drawn from existing information, on the following aspects:
During the 1970s, the National Aeronautics and Space Administration (NASA) developed a system to classify the maturity of technologies as they progress from basic research through development, prototyping, and acquisition processes. This system grouped technologies into nine technology readiness levels (TRLs), with a TRL of 1 being the least mature (“Basic Principles Observed and Reported”) and a TRL of 9 being the most mature (an actual system proven through successful operations). Since then, various governmental organizations have adopted their own specific definitions of the nine TRLs,7 but each follows the same basic structure. Table 1 shows the TRLs defined by the Department of Defense in 2011. The bulk of this REC focuses on archival data storage technologies with TRLs of 7 through 9, that is, technologies that have at least a system prototype demonstrated in an operational environment. Emerging technologies with lower TRLs that show some promise are also discussed briefly.
__________________
7 Government Accountability Office, 2020, “Technology Readiness Assessment Guide: Best Practices for Evaluating the Readiness of Technology for Use in Acquisition Programs and Projects,” Washington, DC.
TABLE 1 Technology Readiness Levels (TRLs), 2011 Department of Defense Definition

| TRL | Definition | Description |
|---|---|---|
| 1 | Basic principles observed and reported | Lowest level of technology readiness. Scientific research begins to be translated into applied research and development (R&D). Examples might include paper studies of a technology’s basic properties. |
| 2 | Technology concept and/or application formulated | Invention begins. Once basic principles are observed, practical applications can be invented. Applications are speculative, and there may be no proof or detailed analysis to support the assumptions. Examples are limited to analytic studies. |
| 3 | Analytical and experimental critical function and/or characteristic proof of concept | Active R&D is initiated. This includes analytical studies and laboratory studies to physically validate the analytical predictions of separate elements of the technology. Examples include components that are not yet integrated or representative. |
| 4 | Component and/or breadboard validation in laboratory environment | Basic technological components are integrated to establish that they will work together. This is relatively “low fidelity” compared with the eventual system. Examples include integration of “ad hoc” hardware in the laboratory. |
| 5 | Component and/or breadboard validation in relevant environment | Fidelity of breadboard technology increases significantly. The basic technological components are integrated with reasonably realistic supporting elements so they can be tested in a simulated environment. Examples include “high-fidelity” laboratory integration of components. |
| 6 | System/subsystem model or prototype demonstration in a relevant environment | Representative model or prototype system, which is well beyond that of TRL 5, is tested in a relevant environment. Represents a major step up in a technology’s demonstrated readiness. Examples include testing a prototype in a high-fidelity laboratory environment or in a simulated operational environment. |
| 7 | System prototype demonstration in an operational environment | Prototype near or at planned operational system. Represents a major step up from TRL 6 by requiring demonstration of an actual system prototype in an operational environment (e.g., in an aircraft, in a vehicle, or in space). |
| 8 | Actual system completed and qualified through test and demonstration | Technology has been proven to work in its final form and under expected conditions. In almost all cases, this TRL represents the end of true system development. Examples include developmental test and evaluation (DT&E) of the system in its intended weapon system to determine if it meets design specification. |
| 9 | Actual system proven through successful mission operations | Actual application of the technology in its final form and under mission conditions, such as those encountered in operational test and evaluation (OT&E). Examples include using the system under operational mission conditions. |

SOURCE: Data from Department of Defense, 2011, “Technology Readiness Assessment (TRA) Guidance,” Arlington, VA.
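
The scoping rule described above (a primary focus on TRLs 7 through 9, with lower-TRL technologies treated only briefly) can be illustrated with a minimal Python sketch. The enum member names, the `in_primary_scope` helper, and the candidate entries are hypothetical and chosen only for illustration; they are not part of the DoD guidance, and the TRL values assigned to the placeholder candidates are not assessments of any real technology.

```python
from enum import IntEnum


class TRL(IntEnum):
    """The nine technology readiness levels; names abbreviate the Table 1 definitions."""
    BASIC_PRINCIPLES = 1           # Basic principles observed and reported
    CONCEPT_FORMULATED = 2         # Technology concept and/or application formulated
    PROOF_OF_CONCEPT = 3           # Analytical and experimental proof of concept
    LAB_VALIDATION = 4             # Component/breadboard validation in laboratory
    RELEVANT_ENV_VALIDATION = 5    # Component/breadboard validation in relevant environment
    PROTOTYPE_RELEVANT_ENV = 6     # Prototype demonstration in a relevant environment
    PROTOTYPE_OPERATIONAL_ENV = 7  # Prototype demonstration in an operational environment
    SYSTEM_QUALIFIED = 8           # Actual system completed and qualified
    MISSION_PROVEN = 9             # Actual system proven through mission operations


# Assumption: "primary scope" means TRL 7 or higher, per the text above.
PRIMARY_SCOPE_MIN = TRL.PROTOTYPE_OPERATIONAL_ENV


def in_primary_scope(level: int) -> bool:
    """Return True if a TRL falls in the 7-9 range; raises ValueError outside 1-9."""
    return TRL(level) >= PRIMARY_SCOPE_MIN


# Placeholder candidates with made-up TRLs, purely to show the filter.
candidates = {"hypothetical-tech-A": 9, "hypothetical-tech-B": 4}
primary = [name for name, level in candidates.items() if in_primary_scope(level)]
print(primary)  # ['hypothetical-tech-A']
```

Using an `IntEnum` keeps the levels ordered and comparable while rejecting values outside 1 through 9, which mirrors the fixed nine-level structure shared by the various agency definitions.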