In recent decades, significant advances in science and technology have greatly enhanced understanding of the many biological molecules that contribute to life’s complexity. RNA is one such molecule that has captured attention for its remarkable diversity and important role in critical life processes. The DNA that encodes human genes safely stores the “blueprint” for life in every cell of the body. DNA does not travel, but instead passes its information through RNA for delivery to the places where its information is needed. Through exquisite natural biological pathways, RNA is edited and revised to give rise to hundreds, in some cases thousands, of distinct RNA molecules for each gene. In this way, RNA molecules allow diversification of the information encoded by DNA, which is essential for the maintenance and survival of complex organisms, including humans.
RNA also acquires modifications during its life cycle, leading to further diversification. As this report will highlight, RNA modifications can change the three-dimensional shape of an RNA molecule, alter the abundance of specific RNAs, and promote or disrupt interactions between an RNA and other biological molecules that are critical for sustaining life. It has become abundantly clear that disruption of the cellular machinery responsible for editing and modifying RNA can lead to a wide range of human ailments, including neurodevelopmental disorders, neurodegenerative diseases, heart disease, autoimmune diseases, cancer, and diabetes (see Box 1-2 in Chapter 1 of the main report and Table 2-1 in Chapter 2). A better understanding of RNA modifications and their linkage to disease will lead to more targeted and personalized medical treatments, which in turn will advance human health.
Although much remains to explore and understand about RNA modifications in biological systems, what is known about these molecular marvels has already been leveraged to produce useful biotechnologies, such as vaccines and medicines. The most recent and prominent example is the vaccines that saved millions of human lives worldwide during the COVID-19 pandemic. The RNA modification N1-methylpseudouridine was instrumental in developing highly effective and safe messenger RNA (mRNA) vaccines against COVID-19 (see Box 2-1 in Chapter 2). The 2023 Nobel Prize in Physiology or Medicine was awarded to the researchers whose early work studying RNA modifications for use in mRNA vaccines laid the foundation for this critical breakthrough. Undoubtedly, RNA modifications will continue to be used to expand mRNA vaccines for other pathogens, such as HIV, influenza, and bacteria. Understanding how bacteria and viruses use RNA modifications to evade the human immune response and cause disease will also help unlock new therapies. For example, antivirals and antibiotics that target or exploit these evasive features hold great promise for protecting human lives from harmful infections.
___________________
1 References are not included in the report summary. Citations appear in subsequent report chapters.
Prevention of infectious diseases is just one example of a valuable outcome resulting from research dedicated to understanding RNA modifications. Using technology similar to that of the COVID-19 vaccines, mRNA vaccines against cancer are currently undergoing clinical trials. In addition, RNA-based therapeutics that incorporate RNA modifications are used to treat rare diseases, such as spinal muscular atrophy (see Figure 2-3 in Chapter 2). RNA modifications also enhance the efficacy of guide RNAs used in the gene-editing technique CRISPR, a promising tool for gene therapy whose development was recognized with the 2020 Nobel Prize in Chemistry. Beyond health and medicine, RNA modifications show exciting promise for enhancing agricultural productivity. In preliminary studies, engineering of RNA modifications has improved crop yields and drought resistance in potatoes (see Figure 2-4 in Chapter 2). Such engineering has the potential to improve food security for billions of people across the globe. Finally, early research also suggests that understanding RNA modifications will spur growth in synthetic biology applications and nanotechnology through the development of more stable and controllable biological pathways and gene expression tools, as well as through an expanded molecular design space for nanostructure building blocks.
Altogether, leveraging knowledge about RNA modifications is expected to greatly boost the bioeconomy. For example, a recent report projected the bioeconomy to generate about $4 trillion per year in economic impact over the next 10 to 20 years,2 with health and medicine and food and agriculture being major contributing sectors. The COVID-19 vaccines alone, by reducing excess mortality, have an estimated economic value of $3.09 trillion in the United States, according to the National Bureau of Economic Research. Clearly, the return on investment in RNA modifications research holds immense potential.
Conclusion 1: RNA modifications are a critical but underexplored area of research. A more complete understanding of RNA modifications will be important for significantly advancing the fundamental knowledge of living systems; maintaining the health of humans, plants, animals, and the environment; preventing and treating disease; improving crop yields and resilience; stimulating the bioeconomy; and addressing other issues of societal importance.
Comprehensively sequencing RNA and all of its modifications is expected to have profound implications: improving public health, securing food supply, contributing to the U.S. economy, and reinforcing the nation’s competitive stance in global scientific and technological innovation. Yet, RNA science stands at a critical crossroads. Despite notable progress and economic value in this growing field, much about how RNA modifications affect the fate and function of RNA molecules in living systems is still unknown. Current technologies cannot discover all RNA modifications, let alone comprehensively sequence them on every RNA molecule. This limitation significantly hampers the ability to study and leverage RNA modifications to address current and emerging societal issues.
In this report, the committee proposes a roadmap of innovation and advances that will enable any RNA from any biological system to be sequenced with all of its modifications. Ultimately, with this capability, the committee envisions that RNA modification profiles for diseases will lead to more personalized and targeted treatments to advance human health. Access to information about any modification on any RNA will also enable the creation of other useful and valuable biotechnologies and instigate transformative changes across various sectors, beyond health and medicine.
___________________
2 This paragraph was modified after release of the report to the study sponsor to correct the estimated economic impact of the bioeconomy as described in McKinsey & Company, 2020. See https://www.mckinsey.com/industries/life-sciences/our-insights/the-bio-revolution-innovations-transforming-economies-societies-and-our-lives (accessed March 18, 2024).
At various stages of the Human Genome Project (HGP), it was considered newsworthy and often amusing to consider whose DNA should be sequenced, even though the DNA genomes of different individuals are nearly identical, differing by about 0.4 percent. However, while every cell in an individual generally has the same genomic sequence, the processing of RNA from each gene is amazingly diverse and dynamic. The collective set of RNA molecules and their modifications, or the “epitranscriptome,” varies between cell types and tissues so that the RNA can meet specific demands—for example, to specify muscle or skin. Further diversity arises from factors such as age, sex, and environment. Recent studies in plants suggest that the epitranscriptome literally changes with the weather.
The HGP’s goal of providing a complete reference genome for humans and model organisms was appropriate. Similarly, “reference” epitranscriptomes that represent specific cell types and conditions in humans, viruses, and a set of model organisms will be invaluable. However, because there are so many epitranscriptomes to determine, even for a single organism or individual, the ultimate and most impactful goal for an epitranscriptome initiative will be to enable sequencing of any epitranscriptome by developing the necessary technologies and associated infrastructure. This advance in capabilities would allow for any epitranscriptome, under any cellular condition or context, to be generated well into the future.
Conclusion 2: Sequencing the vast array of RNA molecules and discovering all modifications in all of their positions under various conditions and cellular states exceeds the challenge of the HGP. Because there are many important epitranscriptomes to determine, developing technology and infrastructure to enable the determination of any epitranscriptome will be the most impactful goal.
Unlocking any epitranscriptome will require several efforts to occur in parallel (Figure S-1). This report outlines the steps to achieving the technological and infrastructural advances necessary to enable the comprehensive sequencing of RNA in human cells and other organisms, marking a pivotal advance in the understanding and practical application of RNA science.
Although it is difficult to predict the exact trajectory of technological advancement, other large-scale efforts, such as the HGP, have shown that focused and concerted organization and funding directed toward a set of well-defined goals will accelerate technological innovation in a field. Other factors that contributed to the success of the HGP were the coordination of efforts among federal agencies and international partners, and the built-in public–private collaboration and competition (see Box 5-1 in Chapter 5). Inspired by the success of the HGP, the United States has since led several other large-scale efforts in the life sciences, such as the Glycoscience Program, the Human Microbiome Project, and the BRAIN Initiative.3 While each initiative is unique in terms of goals and approach, several commonalities link successful initiatives (see Chapter 5).
___________________
3 BRAIN stands for Brain Research Through Advancing Innovative Neurotechnologies.
Conclusion 3: Large-scale, coordinated efforts in the life sciences—such as the HGP, the Glycoscience Program, the Human Microbiome Project, and the BRAIN Initiative—have proven vital in driving innovation in science and technology. Such efforts hold value in their ability to align federal agencies; support public–private partnerships; organize consortia; fund individual laboratories; and prioritize closing gaps in the areas of technology development, synthesis of standards, infrastructure buildout, workforce training, and public awareness.
Even as the importance of epitranscriptomics for addressing a range of challenges across sectors has become clear, progress has been hampered by gaps in sequencing technology and other areas. A focused, large-scale effort is essential to accelerate technological innovation and realize the full promise of the field. Importantly, a clear-sighted strategy and dedicated funding and resources will be needed if the United States is to remain in the vanguard of scientific progress. The United States has begun to invest in RNA modifications research and the capabilities to sequence and study them, with essential leadership from the National Human Genome Research Institute (NHGRI), National Institute of Environmental Health Sciences (NIEHS), National Science Foundation (NSF), and private entities such as the Warren Alpert Foundation and the Margot and Thomas Pritzker Family Foundation. However, there is currently no overarching, strategic U.S. government effort or coordinated investment in the study, sequencing, and mapping of RNA modifications. Like the United States, several countries are funding research in this area, with a few leading focused initiatives. Germany has dedicated significant public funding to RNA modifications, and Australia and Canada are also making major investments in this space, with efforts focused on RNA chemistry and RNA-based therapeutics. U.S. global leadership in this space will hinge on a significant whole-of-government effort and proactive identification of areas for international cooperation.
Conclusion 4: A large-scale effort focused on epitranscriptomics is needed to accelerate technological innovation and scientific progress in the field. Such an effort will require expertise spanning multiple scientific disciplines (e.g., engineering, computer science, life science, social science) and will impact several sectors (e.g., health, agriculture). An endeavor of this scale and scope will entail a substantial investment of time and resources. Dedicated funding to key federal entities—such as NSF, the National Institutes of Health (NIH), the National Institute of Standards and Technology (NIST), the Department of Defense (DOD), and the Department of Energy (DOE)—is critical to enhance their ability to work with academia, industry, philanthropic organizations, and international partners in driving innovation towards sequencing RNA and its modifications and ensuring translation of the resulting scientific breakthroughs into advancements including new, effective biotechnology products.
Recommendation 1: An established oversight body, such as the Office of Science and Technology Policy or a similar entity with appropriate breadth and authority, should catalyze and coordinate efforts supporting a large-scale epitranscriptomics initiative to ensure effective use of resources and minimize duplication. Expertise from the health, agriculture, commerce, energy, national security, and defense sectors will be required. Both research and regulatory agencies should be included as a part of the effort. An implementation plan should be developed and include support for agencies to work with partners in academia, industry, scientific societies, private foundations, international partners, and other relevant groups. The coordinating body should be responsible for strategic coordination of government, academic, and industry partners. The implementation plan should embrace conclusions and recommendations from the committee and aim to do the following:
Ongoing research in the field of RNA modifications needs to continue and expand. Despite increased recognition of the importance of RNA modifications in health and their broad application potential for diagnosis, treatment, and prevention of disease, numerous gaps remain in the understanding of the regulation and function of these modifications. Fundamental research will be critical for identifying the locations of known modifications, discovering new modifications, and uncovering the functional importance of every modification. As new knowledge about RNA modifications is acquired, the importance of RNA modifications to modern biology will become even more apparent and new applications will emerge.
Conclusion 5: Discovery efforts and fundamental research in the field of epitranscriptomics will reinforce the importance and impact of RNA modifications and fuel technological advances that will improve scientists’ ability to sequence them. New funding mechanisms, public and private, that encourage collaboration, spur innovation, and increase interest in RNA modifications will be critical.
Currently, the tools and technologies available for sequencing RNA and its modifications can identify and map only a small subset of the more than 170 known RNA modifications. Thus there is a critical need in the field to improve the sensitivity, specificity, and throughput of technologies that currently exist, and to explore new and emerging instrumentation and methodologies for enhancing capabilities for sequencing RNA modifications and determining their abundance and stoichiometry. A key milestone on the way to this goal is establishing the ability to sequence RNA molecules from end to end, to preserve information about all their modifications.
Conclusion 6: The current tools, technologies, and methodologies for end-to-end sequencing of RNA and all of its modifications are insufficient. The field of RNA biology will be driven forward by improving upon existing approaches and advancing new technologies that are robust and quantitative, and that preserve the information of full-length RNAs.
Conclusion 7: Improving the sensitivity of methodologies for cataloging and quantifying all RNA modifications in a sample, even without positional information, is an important enabling step that will inform the development of future RNA sequencing technologies and facilitate discovery of additional RNA modifications. Achieving this crucial intermediate goal will be spurred by an expanded repertoire of modified nucleosides for use as reference standards and more sensitive instrumentation.
Conclusion 8: Efforts directed toward enabling end-to-end sequencing of RNA and its modifications will accelerate innovation in the life sciences research enterprise but will also pave the way for developing new biotechnologies (e.g., biotherapeutics, vaccines, diagnostics, nanomaterials) and novel approaches that open new doors in life sciences research and other areas that are not yet apparent.
Public and private investment and partnership will be important for advancing epitranscriptomic capabilities. As noted above, NHGRI, NIEHS, NSF, and private foundations such as the Warren Alpert Foundation have already demonstrated leadership and investment in RNA modifications research. The Defense Advanced Research Projects Agency Biological Technologies Office and the new Advanced Research Projects Agency for Health, under DOD and NIH, respectively, offer other potential loci for funding, given their focus on research aimed at solving problems of great practical importance with high-risk, high-impact projects. Investment by these and other U.S. federal entities, such as DOE, will directly enable the leveraging of knowledge gained for biotechnology and manufacturing applications across all major sectors, which is a stated priority for the U.S. government. Given that advances in computational methods are critical to achieving the goals set out in this report, specific calls for funding to support collaborative initiatives that integrate experimental and computational research and development must be prioritized.
Recommendation 2: Federal funders of research—such as the National Institutes of Health, National Science Foundation, Department of Defense, and Department of Energy—should invest in and prioritize (a) addressing limitations and closing gaps in the existing tools and technologies available for epitranscriptomics, (b) exploring new and emerging approaches, and (c) compiling and centralizing resources pertaining to available tools and methods. Existing tools and technologies should be refined and optimized, and novel approaches to characterize RNA modifications should be explored, with the goal of enabling end-to-end sequencing of RNA and its modifications. Information about available experimental methods and associated computational approaches should be compiled and centralized as a resource that researchers can use to understand the utility, biases, strengths, and weaknesses of different methods and tools. It will be critical to use a diversity of funding mechanisms and models, and to encourage and support collaborative initiatives that integrate experimental and computational components.
A concerted investment of time, effort, and funding by key public and private groups will lead, within 15 years, to sensitive and specific technologies, methods, and computational tools capable of identifying and determining the location and abundance of all RNA modifications in a single experiment (Figure S-2). Such tools would enable interrogation of whole epitranscriptomes of individual RNA isoforms and at the single-cell level, revealing unprecedented insights into the influence of modifications on RNA folding, stability, and function.
In the interim, small and high-abundance RNAs, such as transfer RNAs (tRNAs), could represent accessible early targets for developing and evaluating new tools and technologies. Likewise, epitranscriptomes of several high-importance viral pathogens would be a valuable practical payoff for the first phase of this effort. Soon after, complete epitranscriptomes for human cultured cells and tissues may be in reach, and eventually multicellular eukaryotes. Enhanced capabilities can one day enable clinical samples to be studied, in order to understand disease and generate personalized treatments.
Several types of standards are needed to support research and technology development for the RNA modifications field. A broader collection of modified RNAs with known sequence and modification stoichiometry is needed to compare results collected using different methods, within and between labs. Although synthetic routes for generating modified nucleosides exist in many cases, few commercial or nonprofit sources are available for these standards. The lack of market forces supporting the creation of such materials at scale emphasizes the need for federal involvement and leadership. Developing, curating, and promoting standards for epitranscriptomics falls directly within NIST’s mission to “promote U.S. innovation and industrial competitiveness by advancing measurements in science, standards, and technology toward economic security and improved quality of life.” Therefore, NIST is well suited to take the lead for this component of the initiative.
In addition, clear and consistent standards for data and databases—such as universal nomenclature; guidelines for raw data deposition and exchange; and standards for data quality, validation, and analysis—need to be established to facilitate data access and sharing. The National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine at NIH, develops and promotes standards for databases, data deposition and exchange, and biological nomenclature as a part of its responsibilities. Therefore, NCBI is naturally positioned to take on the role of establishing data and database standards for epitranscriptomics. Given the overlap in responsibilities and the unique expertise of each institute, it may be beneficial for NCBI and NIST to partner or collaborate on this endeavor.
Conclusion 9: Several types of standards are needed, specifically (a) technology-agnostic modified RNA reference materials that enable assay validation and cross-referencing of approaches, (b) data standards around nomenclature and clear guidelines for data deposition and exchange, and (c) robust and sustainable platforms for the curation and indexing of vast amounts of RNA data.
Recommendation 3: The National Institute of Standards and Technology should develop, curate, and promote standards to support the field of epitranscriptomics. Specifically, modified RNA reference materials should be developed with a focus on making them widely available and affordable.
Recommendation 4: The National Center for Biotechnology Information should establish and promote standards for databases, data deposition and exchange, and nomenclature for RNA modifications.
If the recommendations are followed, the committee envisions that within 15 years, affordable oligonucleotides of any custom-order sequence, length, modification stoichiometry, and structure could be readily available for research and technology development. Data and database standards could enable seamless access to universally available and shareable information on RNAs of any biotype from any defined set of cellular conditions, including their modification status and their biological, medical, and functional properties. The committee developed milestones and deliverables for achieving this vision (Figure S-3).
Easy access to reliable, accurate, and up-to-date information about RNA modifications is key to advancing epitranscriptomics. Although many databases store information on modified RNAs, they are largely managed by individual research labs or may have a narrow focus on a particular modification or specific type of RNA. While these RNA databases have been vital to advancing the field of RNA biology, a major concern is the loss of resources (e.g., funding, staff) leading to a lack of maintenance or abandonment of carefully curated databases. This poses the risk of limiting scientific growth and understanding, while wasting time, effort, and resources. Centralized administration of RNA data resources is needed to ensure sustainable funding and stability, so that the current knowledge on RNA modifications can be integrated and new modifications can be indexed. NIH is well suited to establish, maintain, and coordinate databases for RNA modifications data. Specifically, housing epitranscriptomic data falls squarely within the mission of NCBI, which aims to develop new information technologies to aid in the understanding of health and disease.
Conclusion 10: The prevalence of “home-grown,” small-group-supported RNA databases has been vital to advancing the field of RNA biology. Nonetheless, a major concern is the loss of resources (e.g., funding, staff) leading to a lack of maintenance of these laboratory-housed databases. Abandoning carefully curated databases may limit scientific growth and understanding, and waste time, effort, and resources.
Recommendation 5: The National Institutes of Health (NIH) should establish and maintain a sustainably funded, stable, integrated, and centrally managed database (or ensemble of databases) that is a long-lasting and always-current source of curated information about RNAs and their modifications. Such a resource could build upon, through mirroring or linkage, existing well-maintained databases that contain valuable information related to RNA modifications. Efforts to develop such centralized databases should strive to provide accurate, single-molecule, end-to-end information on RNA modifications. NIH should initiate U.S. collaboration with other countries invested in research on RNA and its modifications. In consultation or partnership with the National Institute of Standards and Technology, standards for the deposition and exchange of experimental raw data should be developed and promoted according to FAIR (findability, accessibility, interoperability, and reusability) principles to ensure data in the field of RNA modifications are accessible, well maintained, and user friendly.
While the components previously listed will help drive innovation, other factors are essential for transforming the field of RNA modifications. These additional drivers include a well-trained and informed workforce; centralized facilities suited to specialized tasks; readily available, high-quality reagents and research materials; organization and coordination at the national and international levels; and a supportive policy environment (discussed in Chapter 5). Especially important to ensuring the success of a coordinated, large-scale effort in the RNA modifications field is a workforce that is well equipped to address new scientific and technical opportunities and challenges creatively. This workforce will comprise both retrained professionals and the next generation of scientists.
Inclusion of topics highlighting the societal importance of RNA—such as the role of RNA and its modifications in key molecular processes, centrality to health and disease, and range of applications—in curricula at various levels will be important in motivating and preparing future generations to pursue careers in RNA biology and biotechnology. Exposure to chemistry, molecular and cell biology, and computer science (e.g., basic programming, bioinformatics) will be pivotal in ensuring that students develop the skills needed to emerge as leaders in the growing, increasingly multidisciplinary field of RNA modifications. Students will benefit from engaging with RNA science through hands-on, authentic research experiences and multidisciplinary modules that challenge them to consider open-ended, real-world problems.
The chemistry and biology of RNA modifications can be integrated into already existing undergraduate courses. For-credit or paid research opportunities through existing mechanisms can be made known and available to support students from a variety of socioeconomic backgrounds. Several opportunities and training programs that already exist through NSF, the National Institute of General Medical Sciences, and the Howard Hughes Medical Institute can be adapted or expanded, in consultation with education and pedagogy experts and scientific societies with expertise in RNA, to emphasize RNA-focused research and career paths.
Individuals already in the workforce can make important contributions to advancing the study of RNA science and the application of epitranscriptomics, if they are offered professional development, retraining, and continuing education opportunities tailored to their needs. These opportunities can come either from in-house workforce development programs (in the case of large companies) or through community-based intermediary organizations, local colleges and universities, and professional organizations.
Conclusion 11: Greater emphasis on RNA science in undergraduate courses is needed to build a better infrastructure for embracing future generations in the workforce. In addition to further education, the existing and future workforce needs interdisciplinary training with strong quantitative and computational skills.
Conclusion 12: Educational efforts in the RNA modifications field need to (a) use methods that promote engagement, (b) reflect the interdisciplinary nature of the science in education and related workforce development efforts, (c) invest in reaching and engaging students and trainees from diverse backgrounds, and (d) scale up proven strategies for retaining trainees in piloted programs.
Recommendation 6: Institutes and funding agencies, such as the Howard Hughes Medical Institute, National Institute of General Medical Sciences, and National Science Foundation—in consultation or partnership with relevant education and pedagogy experts; scientific societies, such as the RNA Society, American Chemical Society, and American Society for Biochemistry and Molecular Biology; and industry groups, such as the Parenteral Drug Association and the International Society for Pharmaceutical Engineering—should build upon existing educational materials and training opportunities for high school, undergraduate, graduate, and postgraduate groups, and for the private sector. Such materials and opportunities should be tailored to fit the needs and interests of each group and should cover the basic biological, chemical, and biochemical principles of RNA modifications and the tools available for their study. All materials should incorporate engaging examples that demonstrate the importance of RNA and its modifications in fundamental science, health and medicine, food safety, the environment, and manufacturing.
With proper attention to education, training, recruitment, and retention, a well-trained, impassioned, diverse U.S. workforce with interdisciplinary expertise will be able to apply and advance sophisticated RNA biology, including epitranscriptomics, across the public and private sectors, academia, and industry (Figure S-4).
In addition to the novel RNA sequencing tools, technologies, and computational methods; data infrastructure; and workforce benefits that would come from a large-scale initiative focused on epitranscriptomics, numerous other practical payoffs and broader impacts on society would result from such an initiative. The following lists some of the potential outcomes and associated broader impacts that are anticipated to arise from a 15-year investment in this space:
In its essence, the goal of all biomedical science is to relieve human suffering and maintain human health. Time, effort, and money invested in this epitranscriptome initiative relate to this goal directly and will provide information helpful to numerous enterprises, including those listed above. Without knowing the exact composition of all RNA molecules that derive from each gene, the ability of researchers to understand the underpinnings of health and disease is severely limited. Furthermore, if insufficient capabilities exist to directly sequence and study full-length RNA molecules and their modifications, then the ability to leverage that information for biotechnology applications suffers. All major sectors, including health and medicine, agriculture, energy, commerce, defense, and national security, stand to benefit from a better understanding of RNA modifications. More broadly, the associated activity across these numerous sectors may positively impact the bioeconomy. This report charts a future for sequencing RNA and its modifications, toward a new era of biology and medicine.