Accelerating the Use of Pathogen Genomics and Metagenomics in Public Health: Proceedings of a Workshop (2025)

Chapter: 6 Gaps and Opportunities in Applications, Funding, and Research

Suggested Citation: "6 Gaps and Opportunities in Applications, Funding, and Research." National Academies of Sciences, Engineering, and Medicine. 2025. Accelerating the Use of Pathogen Genomics and Metagenomics in Public Health: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/29103.

6

Gaps and Opportunities in Applications, Funding, and Research

The fifth session of the workshop highlighted metagenomics technology and opportunities in workforce and capacity development that could advance the capability to detect, address, and prevent disease outbreaks. Liliana Brown, office director for the Office of Genomics and Advanced Technologies at the NIH National Institute of Allergy and Infectious Diseases (NIAID), moderated the session. Charles Chiu, professor of laboratory medicine and infectious diseases and director of the clinical microbiology laboratory at University of California, San Francisco (UCSF), discussed applications of metagenomic next-generation sequencing (mNGS) in clinical diagnosis, detection of novel pathogens and variants, and outbreak investigation. Alli Black, senior epidemiologist at the Washington State Department of Health, outlined applied genomic epidemiology workforce gaps and efforts needed to address them. Heather Carleton, chief of the Enteric Diseases Laboratory Branch in the Division of Foodborne, Waterborne, and Environmental Diseases at the Centers for Disease Control and Prevention (CDC), explored reporting and capacity gaps in U.S. public health surveillance for enteric disease.

APPLICATIONS FOR NEW PATHOGEN DETECTION AND LIMITATIONS OF GENOMICS DATA

Chiu provided an overview of the use of mNGS for clinical diagnosis of infections and for public health pathogen detection and characterization. He outlined limitations and potential applications of genomic data.

Metagenomic Next-Generation Sequencing for Clinical Diagnosis

Health care providers are not able to diagnose the cause of a substantial proportion of infections in hospitalized patients, including meningitis, encephalitis, pneumonia, fever, sepsis, and hepatitis, said Chiu. He described mNGS as a “shotgun sequencing approach” that captures all DNA and RNA in a clinical sample of body fluid or tissue. Most of the sequencing is of the human host, but a small fraction of nonhuman reads presumably corresponds to bacteria, viruses, fungi, and parasites in the sample. A key advantage of this technique is its independence from the need for pathogen-specific primers or probes, differentiating it from polymerase chain reaction (PCR) and other targeted methods, said Chiu. Given that mNGS does not target any one pathogen, in principle this approach is able to identify any potential infection. Chiu explained that mNGS is well suited for identifying divergent viruses, including novel sequence-divergent viruses that PCR assays may be unable to detect.

Chiu outlined how teams at UCSF have developed clinical mNGS assays for cerebrospinal fluid, blood plasma, respiratory aspirations, and other body fluids (Gu et al., 2021; Miller et al., 2019; Wilson et al., 2019). Designated as “laboratory developed tests,” these assays are available for clinical reference testing and can be used for informing clinical management. They are not approved by the Food and Drug Administration (FDA) as in vitro diagnostic products, but FDA has designated the cerebrospinal fluid and the viral respiratory mNGS tests as breakthrough devices. Chiu and colleagues also developed a cloud-based bioinformatics pipeline compliant with the Health Insurance Portability and Accountability Act (HIPAA) (Miller et al., 2019; Naccache et al., 2014).1 Combining clinical medical records with longitudinal mNGS testing over a seven-year period, Chiu was able to retrospectively determine the diagnostic yield and clinical utility of these tests. He and colleagues found that mNGS testing of cerebrospinal fluid performs better than any other diagnostic modality. Yet, given that the sensitivity of mNGS testing is 63 percent, he emphasized that the test should be used to complement, rather than replace, microbiology lab testing (Benoit et al., 2024). Chiu described mNGS testing as helpful in identifying potential pathogens in clinical samples that are undetectable by other testing methods, making mNGS a good fit for public health efforts to identify novel pathogens or new manifestations of known pathogens.

Current UCSF projects include developing methods of viral respiratory testing and automating as many as possible of the 300 steps involved in the cerebrospinal fluid mNGS test, Chiu highlighted. Noting that streamlined sequencing pipelines promote adoption, he emphasized that nanopore sequencing could be conducted in low-resource or point-of-care settings. Several software packages have been developed to facilitate metagenomic analysis (Chandrakumar et al., 2022; Gu et al., 2021; Kalantar et al., 2020; Lu et al., 2022). He underscored the importance of appropriate pathogen reference databases and of bioinformaticians in interpreting the clinical significance of organisms identified in complex microbiomes. This is particularly important for nonsterile sites (e.g., respiratory fluid, stool samples) that may contain organisms that are pathogenic, commensal, or even not previously associated with human infections, Chiu added.

___________________

1 The Health Insurance Portability and Accountability Act of 1996, Public Law 104-191, 104th Cong., 2nd sess. (August 21, 1996).

Using these methods, Chiu and colleagues identified adeno-associated virus 2 (AAV2) as part of a polyviral infection that appears to be associated with severe acute pediatric hepatitis (Servellita et al., 2023). Additionally, mNGS was used in investigating an outbreak of Fusarium solani in U.S. patients with encephalitis, associating the outbreak with fungal meningitis from surgical procedures performed in Mexico (Smith et al., 2023, 2024). Whereas pathogen genomics uses whole genome sequencing (WGS) performed on positive bacterial or fungal cultures or clinical samples of viruses, Chiu highlighted that mNGS offers the advantage of not requiring the organism to be cultured and suggested that new phylogenetic analysis methods can be used in outbreak investigation.

Indirect Versus Direct Diagnosis of Infection

The clinical manifestations of infection from a pathogen are often unknown, particularly in the case of novel pathogens, said Chiu. For instance, researchers required several months after SARS-CoV-2 pathogen identification to detect a significant rate of asymptomatic transmission. Clinical mNGS enables leveraging RNA reads and development of machine learning prediction models to aid in predicting whether a patient has a bacterial, viral, fungal, or parasitic infection. For example, Chiu co-developed a model based on RNA sequencing of the host response to determine whether a patient has a parasitic infection or a fungal infection like histoplasmosis or coccidioidomycosis. Even without detecting the specific pathogen, this approach could potentially indicate that a patient has a Coccidioides infection via detection of a Coccidioides-specific host signature response. Similarly, this approach could potentially identify a host-response signature for Mycobacterium tuberculosis, an infection that is difficult to identify due to the absence or very low levels of the bacteria in many samples. Using metagenomic data already being used for pathogen detection, this approach repurposes RNA reads in developing host-response-based models, Chiu noted.

Limitations of Genomic Data

Chiu also outlined several limitations of genomic data, such as its general inability to yield functional information. For instance, genomic data can assist in predicting antimicrobial resistance (AMR) but tend to be inconsistent when used alone, underscoring the importance of host-response profiling for disease. Moreover, genomic data provide only indirect insights into pathogenicity and are less useful if annotated metadata are unavailable for clinical correlation of results, said Chiu. He added that genomic data are limited by the information available in reference databases, which may be incomplete and biased. Privacy and confidentiality considerations, such as HIPAA protections, also apply. Furthermore, standardization and interoperability in data generation and analysis are lacking, given that each laboratory has independent protocols and techniques, he stated. Currently, no assays for metagenomic sequencing have gained full FDA approval. Chiu continued that genomics remains too expensive, slow, and complex, and its role in clinical microbiology or public health remains unclear. Additionally, the role of different stakeholders in generating, analyzing, and maintaining data needs clarifying, as does how these efforts will be funded across public health, academia, and industry. Despite these limitations, Chiu maintained that mNGS is an agnostic approach that enables techniques such as host-response modeling, and he predicted that available data on the clinical impact of mNGS on patient care will expand as clinically validated tests receive FDA approval.

STRENGTHENING THE APPLIED GENOMIC EPIDEMIOLOGY WORKFORCE

Black discussed challenges and opportunities related to strengthening the applied genomic epidemiology workforce. She focused on immediate steps that can be taken to expand capacity, strategies to develop the next generation of the workforce, and lessons learned from past experiences that can guide work today.

Recruitment and Retention Challenges

Noting an increasing need for genomic epidemiology and advanced molecular detection (AMD) expertise within public health, Black highlighted a challenge in recruiting individuals who are skilled in using command-line-interface-based software and high performance computing infrastructure to investigate biological questions. People with this skillset are often trained at the doctoral level, a hiring pool from which public health agencies typically do not recruit epidemiologists, Black said. She asserted that successful recruitment efforts will require sufficient funding and clear emphasis on creativity and implementation science research inherent in this area of public health. To attract the needed workforce, recruiting efforts should highlight the opportunities to investigate novel questions and the academic challenges inherent in developing AMD within public health departments, said Black.

Pathogen genomics is pathogen-agnostic, but teams in public health departments are segregated by pathogen, Black pointed out, which creates a challenge for training. Furthermore, professionals who want to apply genomics have diverse backgrounds, ranging from laboratorians to bioinformaticians to epidemiologists.

She contended that to support retraining the current public health workforce, it is important to consider the breadth in training and experience as well as the limited time availability of working professionals. Moreover, given the variety of processes and expertise required to conduct AMD, training should build cross-team visibility and understanding, creating a cohesive shared knowledge base. The Pathogen Genomics Centers of Excellence (PGCoE) network is an avenue for such training. The PGCoE in Massachusetts has created a modular training and education pathway. This modularity enables courses to be tailored to fill gaps for professionals with diverse strengths and areas of expertise. Black remarked on the need to blend education and training, incorporating foundational theory with clear examples of how to use pathogen genomics in epidemiological work, to create content that is concrete, tangible, and relevant.

In addition to recruitment and retraining needs, the public health field faces retention challenges, Black noted, as many professionals are moving to careers in academia or industry. Limited public health funding is a barrier to offering salaries competitive with those in software development and other technology industries. She described how successful recruitment of bioinformaticians, molecular epidemiologists, and other professionals with technical computational backgrounds typically involves offering high-level positions with salaries somewhat competitive with those in industry but lacking room for advancement. Black explained that public health job classifications in the state of Washington are determined by the state government, and establishing a new job class, such as bioinformatician or molecular epidemiologist, is a multiyear process. Nonetheless, retention efforts necessitate formalized pathways that offer career expansion and development, she stated. Public health offers unique and interesting problems to solve, so a forward-looking perspective supports the retention of creative researchers drawn to this field. Black added that retaining creative thinkers also entails pushing boundaries while maintaining standards of practice.

Strengthening Education in Genomic Epidemiology

Black outlined steps to integrate genomic epidemiology into public health and epidemiology curricula. Currently, genomic epidemiology is absent from the degree pathways most epidemiologists follow: master of public health, master of science in epidemiology, or doctorate in epidemiology. Typically, postgraduate applied epidemiology fellowships have limited training on genomic epidemiology. Given that work in infectious disease epidemiology increasingly requires the ability to work with molecular data, she maintained that genomic epidemiology should be a foundational component of those programs, similar to current training on conducting observational research and epidemiological study design. Black highlighted that required courses in the University of Michigan Hospital and Molecular Epidemiology program include polymicrobial communities laboratory, molecular epidemiology, and hospital epidemiology, with electives that provide a greater understanding of infectious diseases and genomics. The learning objectives on the molecular epidemiology course syllabus specify hands-on experience in microbial genomic data analysis and in interpreting microbial genomic data to inform public health action. Black contended that this content should be moved from track-specific curriculum to public health core curriculum.

Broadening the Scope of Advanced Molecular Detection

The field of public health has historically conceptualized AMD work too narrowly, Black observed. Workforce development has focused on highly visible gaps in genomic surveillance fundamentals, such as the capacity to generate sequence data and access bioinformatics to assemble and analyze sequencing reads. She maintained that while these advancements are necessary, they are not sufficient to leverage sequence data for public health action. It is also important to focus on strengthening workforce capacity to interpret epidemiological dynamics from genomic data, evaluate uncertainty in those interpretations, and design genomic surveillance systems tailored to epidemiological questions. Black stated that although the field is able to generate genomics data, public health professionals often do not feel sufficiently comfortable interpreting the data to use it to inform concrete action. Thus, Black suggested that workforce development should extend beyond sequencing data generation, assembly, and analysis and focus also on making epidemiological interpretations from the data and strengthening how interpretations are communicated to public health partners such that the information can be used for response. Black added that AMD requires a vast scope of professionals across teams collaborating in a united fashion, underscoring the need to fund organizational support positions, such as grants and contracts managers, administrative assistants, and program managers.

GAPS AND OPPORTUNITIES FOR PUBLIC HEALTH INVESTMENTS IN ENTERIC DISEASES

Carleton explored opportunities to invest in isolate- and sample-based enteric disease surveillance in the United States. She described current surveillance procedures using the general example of a person with a severe diarrheal illness who seeks care from a clinician. The clinician will attempt to isolate the bacteria and identify the pathogen, she said. However, this process may take several days, thereby delaying diagnosis. Once a pathogen (such as Salmonella, Escherichia coli, Campylobacter, Cronobacter, or Shigella) is identified, the isolate is sent to a public health lab that performs WGS to detect the serotype, characterize the AMR profile, and determine whether the isolate is part of an outbreak. The laboratory also sends WGS data to PulseNet in an effort to link clinical cases to a common source. Within PulseNet, pathogen WGS data are transmitted to a centralized CDC database. PulseNet data analysts monitor the database to find clusters of isolates sharing the same core genome multilocus sequence typing profile and determine whether they are associated with the same outbreak. Upon identifying a potential outbreak, CDC works with a national network of epidemiologists and with FDA and the USDA Food Safety and Inspection Service (FSIS) to determine the potential source, Carleton explained.
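
As a rough illustration of the cluster-finding step described above, the following Python sketch groups isolates by their core genome multilocus sequence typing (cgMLST) profiles. It is a toy example under stated assumptions, not a description of PulseNet's software or criteria: each profile is assumed to be a tuple of allele numbers, one per locus, and the isolate names, profiles, and `max_diff` threshold are all hypothetical. With the default `max_diff=0`, only isolates with identical profiles are grouped, matching the "same profile" wording above; a larger value shows how a relaxed threshold would merge near-identical isolates.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count the loci at which two cgMLST profiles carry different alleles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(1 for x, y in zip(a, b) if x != y)


def cluster_isolates(profiles, max_diff=0):
    """Single-linkage grouping: isolates within max_diff alleles are linked."""
    parent = {iso: iso for iso in profiles}

    def find(x):
        # Follow parent links to the cluster representative (union-find).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b]) <= max_diff:
            parent[find(a)] = find(b)

    clusters = {}
    for iso in profiles:
        clusters.setdefault(find(iso), []).append(iso)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical five-locus profiles; real cgMLST schemes use far more loci.
profiles = {
    "iso_A": (1, 4, 2, 7, 3),
    "iso_B": (1, 4, 2, 7, 3),
    "iso_C": (1, 4, 2, 8, 3),
    "iso_D": (5, 9, 6, 1, 2),
}

exact = cluster_isolates(profiles)
relaxed = cluster_isolates(profiles, max_diff=1)
print(exact)
print(relaxed)
```

In this toy data set, `iso_A` and `iso_B` share a profile, `iso_C` differs from them at one locus, and `iso_D` is unrelated, so the choice of threshold determines whether `iso_C` joins the cluster.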

Opportunities to Improve Reporting in Public Health Disease Surveillance

Each year, approximately one in six Americans contracts a foodborne illness, resulting in 55 million annual cases, said Carleton. In contrast, approximately 60,000 isolates are submitted to PulseNet each year, resulting in detection of 400 potential outbreaks annually, she added. Despite limitations, the current U.S. outbreak genomic surveillance system prevents approximately 250,000 foodborne illnesses and saves the economy $500 million per year by conservative estimates, she emphasized. These benefits could expand with improved public health reporting and increased awareness, both in terms of clinicians knowing how and when to report and patients knowing when to report symptoms to a health care provider. Furthermore, not all clinical and diagnostic labs submit samples to public health in a timely manner, she noted. Carleton outlined that a common disease trajectory for Salmonella infection involves three to seven days to develop symptoms severe enough to warrant a health care visit, another three to five days to isolate the bacterial pathogen from the sample, and additional time for the isolated sample to be shipped to public health laboratories and processed for WGS. All told, this process requires two to four weeks after initial illness for a public health laboratory or epidemiologist to link the case to a potential outbreak; at that point, patients may no longer remember the specific foods they had consumed a month before. Moreover, each state has its own rules and regulations, and some states do not require submission of all enteric bacteria samples to public health laboratories. Carleton described the resulting “patchwork quilt” of reported pathogens as leaving gaps in the ability to detect potential multistate outbreaks.
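
The scale of the gap between illnesses and sequenced isolates can be made concrete with some back-of-the-envelope arithmetic. The Python sketch below uses only the figures attributed to Carleton above; the derived ratios are simple divisions offered for orientation, not statistics reported at the workshop, and the variable names are illustrative.

```python
# Figures as cited in the text above.
annual_cases = 55_000_000      # estimated foodborne illness cases per year
annual_isolates = 60_000       # isolates submitted to PulseNet per year
annual_outbreaks = 400         # potential outbreaks detected per year

# Derived ratios (plain arithmetic on the cited figures).
isolate_share = annual_isolates / annual_cases
cases_per_isolate = annual_cases / annual_isolates
isolates_per_outbreak = annual_isolates / annual_outbreaks

# Days elapsed before an isolate can even be shipped, as (minimum, maximum):
# symptom onset to health care visit, plus time to isolate the pathogen.
symptom_days = (3, 7)
isolation_days = (3, 5)
pre_shipping_days = (
    symptom_days[0] + isolation_days[0],
    symptom_days[1] + isolation_days[1],
)

print(f"Share of cases with a submitted isolate: {isolate_share:.2%}")
print(f"Estimated cases per submitted isolate: {cases_per_isolate:.0f}")
print(f"Isolates per potential outbreak detected: {isolates_per_outbreak:.0f}")
print(f"Days before shipping: {pre_shipping_days[0]} to {pre_shipping_days[1]}")
```

On these figures, roughly one isolate is submitted per 900 estimated cases, and 6 to 12 days pass before shipping begins, which leaves the remainder of the two-to-four-week window for transport, sequencing, and analysis.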

Carleton discussed reporting issues associated with Culture Independent Diagnostic Testing (CIDT), a method with increasing use rates due to its ability to identify the cause of illness within hours of receiving a sample. However, CIDT does not produce samples of DNA or RNA needed for WGS, limiting capacity to link cases and detect outbreaks. She added that many clinics do not send samples used for CIDT to public health laboratories, and in some cases, CIDT renders samples incompatible with culture. In states requiring diagnostic and clinical laboratories to forward samples, shipping and storage methods can result in samples that are no longer viable for testing upon arrival, Carleton stated.

Opportunities to Expand Public Health Surveillance Capacity

Capacity building constitutes an opportunity for public health investment, said Carleton. Diverse bacterial, viral, and parasitic pathogens cause foodborne illness, yet expertise is spread across various CDC and public health labs that focus on specific pathogen types. For example, there is no unified approach to diarrheal sample processing. She remarked on this opportunity to create more unified approaches within existing surveillance systems to research the full spectrum of pathogens causing illness. Moreover, the pathogen genomics disease surveillance system is not fully funded to sequence all isolates submitted to public health laboratories. Limited funding forces laboratories to consider delaying processing until the batch of bacterial isolates is of adequate size to achieve cost-effectiveness, a strategy that can limit or delay the detection of potential foodborne illness outbreaks. Furthermore, many public health departments lack the resources to interview all foodborne illness cases. Some interviews are delayed until the case is associated with a potential outbreak, resulting in interviews conducted weeks after initial food exposure. Carleton specified that approximately 10 percent of isolates submitted to PulseNet are part of an outbreak investigation and linked to a potential source, and PulseNet is unable to determine the source for the other 90 percent of isolates received. Some of these isolates are recurring, emerging, or persisting strains that feature genetic relatedness or features such as AMR genes. Carleton emphasized that these strains may require different control measures than those used for outbreaks, and investment could aid source identification efforts.
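
The batching trade-off mentioned above can be sketched with a simple cost model. Every number in this Python example is hypothetical and chosen only to make the arithmetic easy to follow: it assumes a fixed cost per sequencing run that is shared across the batch, a constant per-isolate cost, and a steady arrival rate of isolates. None of these values come from the workshop.

```python
def batch_tradeoff(batch_size, run_cost, per_isolate_cost, arrivals_per_day):
    """Return (cost per isolate, days needed to accumulate the batch).

    Assumes a fixed run cost split evenly across the batch and a steady
    arrival rate; both are simplifying assumptions for illustration.
    """
    cost_per_isolate = run_cost / batch_size + per_isolate_cost
    days_to_fill = batch_size / arrivals_per_day
    return cost_per_isolate, days_to_fill


# Hypothetical inputs: $800 per run, $50 per isolate, 2 isolates arriving per day.
results = {
    size: batch_tradeoff(size, run_cost=800.0, per_isolate_cost=50.0,
                         arrivals_per_day=2.0)
    for size in (4, 8, 16)
}

for size, (cost, days) in results.items():
    print(f"batch of {size:>2}: ${cost:.0f} per isolate, {days:.0f} days to fill")
```

Under these assumptions, quadrupling the batch size cuts the per-isolate cost from $250 to $100 but lengthens the wait for the first isolate in the batch from 2 days to 8, which is the kind of delay in outbreak detection described above.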

Carleton highlighted that generating strain-level subtyping information using metagenomics on a stool sample is challenging due to signal-to-noise levels. Most metagenomics approaches do not link serotype and AMR data to the pathogen, and the similarity between pathogen DNA and commensal bacterial DNA makes it difficult to distinguish them within a metagenomic shotgun sequencing sample. Further, public health infrastructure lacks capacity to support shotgun metagenomic sequencing and bioinformatics for food safety. She described how the public health disease surveillance system has used amplicon-based metagenomics approaches that target thousands of informative regions in stool DNA for a pathogen of interest, most often Salmonella. This method is compatible with isolate WGS data and generates a fingerprint used in determining whether a case is associated with an outbreak. She added that this approach can be designed for pathogens other than bacteria and for particular genes of interest such as those associated with AMR. Expanding the use of the method will require investment to build assays, she noted. Although shotgun metagenomics faces a signal-to-noise issue, its pathogen-agnostic nature makes it a viable approach, Carleton stated. This method allows identification of known pathogens as well as emerging and unknown pathogens, although biological or bioinformatic methods to enrich DNA in stool samples may be necessary to generate the molecular subtype.

Given that one in four U.S. foodborne illness outbreaks are caused by an unknown agent, additional methods and tests are needed to identify novel pathogens and to detect gaps in existing methods, Carleton asserted (Scallan et al., 2011). Emphasizing that unresolved outbreaks generate a substantial cost to public health, she suggested investing in methods, capacity, epidemiology, and bioinformatics to leverage unknown outbreaks to identify new or underreported enteric pathogens. Carleton added that unified approaches for bacteria, viruses, and parasites—coupled with new approaches for strain-level subtyping of enteric bacteria from metagenomic samples—could help address the 90 percent of foodborne illness isolates that researchers cannot currently identify as part of an outbreak.

DISCUSSION

Metagenomic Sequencing Reporting and Data Sharing

A participant asked about methods of reporting pathogens of public health importance identified via mNGS and sharing genetic data. Chiu replied that UCSF submits any detected reportable organisms to both the San Francisco and California Departments of Public Health and shares microbial data, as these are not identifying and are not considered protected health information. He remarked that UCSF can obtain patient consent for use of human genomic data in host-response analysis, but this requires institutional review board (IRB) approval and therefore is not an independent, streamlined process. Brown asked about developing system-wide capability to develop host-based diagnostics, given data-access issues. Chiu said he received IRB approval to study human genomic data in aggregate, noting that some host-based diagnostics can be developed with RNA sequencing data rather than raw transcriptomic data. He added that it is possible to deidentify data by converting raw sequence data to count tables for analysis.
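
The conversion from raw sequence data to count tables mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the data transformation only, not the UCSF pipeline: it assumes reads have already been aligned and assigned to a gene, and the record layout, sample names, gene labels, and sequences are all hypothetical. Whether a count table is sufficient for deidentification in a given setting is a policy question that the code does not address.

```python
from collections import Counter


def to_count_table(read_assignments):
    """Collapse (sample_id, gene, sequence) records into per-sample gene counts.

    The nucleotide sequence itself is discarded, so the output holds only
    how many reads mapped to each gene in each sample.
    """
    table = {}
    for sample_id, gene, _sequence in read_assignments:
        table.setdefault(sample_id, Counter())[gene] += 1
    return {sample: dict(counts) for sample, counts in table.items()}


# Hypothetical pre-assigned reads; gene labels and sequences are placeholders.
reads = [
    ("sample_1", "GENE_A", "ACGTTGCA"),
    ("sample_1", "GENE_A", "ACGTTGCC"),
    ("sample_1", "GENE_B", "TTGACCGA"),
    ("sample_2", "GENE_A", "ACGTAGCA"),
]

counts = to_count_table(reads)
print(counts)
```

The resulting table keeps the expression signal needed for host-response modeling while dropping the read-level sequences from which individual genetic variants could be read.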

Challenges and Opportunities in Strengthening the Genomic Epidemiology Workforce

David Blazes asked Black whether she has looked at workforce development efforts outside of the United States, such as the NGS Academy. Black remarked on restrictions limiting her international engagement but noted that the Washington State Department of Health Office of Communicable Disease Epidemiology conducts gaps analyses to understand barriers to using genomic data and has developed a landscape analysis toolkit that is available via the PGCoE network. Praising workforce development efforts by the NGS Academy and Public Health Alliance for Genomic Epidemiology (PHA4GE), Black expressed interest in unifying and consolidating efforts with such groups while noting structural challenges to international collaboration within certain government positions.

A participant remarked that leaders in genomic epidemiology often hold numerous responsibilities, such as running program units, writing grants, supervising employees, serving on working groups, and other obligations not directly related to technical development of tools or other solutions to address issues in pathogen genomics. Black replied that public health capacity is at an inflection point at which applied genomics expertise is increasingly distributed across the full epidemiology workforce. To meet this demand, she said, there are opportunities to increase the number of graduate students in genomic epidemiology and improve the training they receive. Those in senior leadership roles are simultaneously called on to provide strategic vision and subject-matter expertise; the distribution of expertise to more junior staff would help balance workloads. Noting the importance of support systems for genomic surveillance studies, Black highlighted a collaborative pilot project to perform WGS on all Shigella cases within the county rather than limiting sequencing to outbreaks and priority specimens. This initiative involves interdependent efforts from the enteric microbiology lab, bioinformatics pipelines, data integration epidemiologists, and molecular epidemiologists. Black underscored the value that a program manager would add to this project by relieving her of supervisory responsibilities.

Carleton stated that genomic epidemiology training needs in the laboratory setting include understanding how to conduct investigations and translate data for epidemiologists, and she highlighted the demand for professionals with expertise in data translation and bioinformatics, as well as knowledge of different pathogens and AMR. Echoing the value that program management would bring to pathogen surveillance projects, Carleton added that economic evaluation would enable better understanding of surveillance system effects, but current staffing does not allow for these evaluations.

A participant commented that bioinformatics capacity continues to be viewed as supplemental rather than fundamental within public health. Emphasizing the need for epidemiologists with some understanding of bioinformatics, he noted that bioinformatics capacity within epidemiology is distinct from that in the laboratory. Highlighting the interdisciplinary nature of AMD, Black underscored the need to break down silos, including the substantial laboratory–epidemiology divide in bioinformatics. Specifying that pipeline development and assembly require a skill set distinct from comparative genomic analysis and interpretation, Black described bioinformatics and genomic epidemiology as overlapping; she described her work to promote an approach in which public health laboratory bioinformaticians are increasingly involved in pipeline development, genomic assembly, and genomic data quality control. Blazes asserted that cost analyses could be beneficial to efforts to add bioinformatician staff positions. In the absence of an in-house bioinformatician, organizations must outsource these services at a cost. Combining this information with the cost of not detecting and investigating outbreaks as measured in dollars or human lives could help to persuade leadership, he added.

An attendee asked whether efforts to increase formal training in genomic epidemiology have engaged the Association for Professionals in Infection Control and Epidemiology or the Association of Schools and Programs of Public Health. She posited that lessons learned regarding the retention of clinicians in public health could be applied to retaining midcareer professionals with bioinformatics, artificial intelligence, and other valued skill sets in the public health field. Carleton stated that the development of advancement pathways is critical for retaining midcareer professionals; such pathways are currently lacking for bioinformaticians at CDC. Carleton added that the CDC Epidemic Intelligence Service does not yet have a genomic epidemiology or molecular epidemiology component in its applied epidemiology training program. The Association of Public Health Laboratories–CDC Fellowship Program offers a bioinformatics fellowship, but it recruits professionals rather than retaining midcareer employees. Black remarked that exchange programs between public health and academia could foster integration of new techniques and sustain interest in the field. Sabbatical laboratory experiences or postdoctoral training fellowships are rare, but designing more such programs could help to mitigate the academia–public health divide. Additionally, Black noted that academic–public health partnerships could yield opportunities for epidemiologic analysis using genomic data (e.g., studying pathogens with notably changing epidemiology). Brown noted interest among mathematicians in collaborating on infectious disease computational mathematical models, which could be facilitated by integrating academia and public health.

Duncan MacCannell remarked on the challenges of building out the molecular epidemiology skill set to meet workforce needs, highlighting the value of PHA4GE training resources. Commenting on the divide between epidemiology and laboratory research, with AMD typically associated with the latter, MacCannell underscored that involving epidemiologists early in study design enables tailored approaches and outputs that translate to technologies. He asked how to use an integrated approach, given the siloed public health system and complex funding lines. Black replied that programs could be structured broadly to facilitate bidirectional data flow between epidemiology and the laboratory within an organization. For instance, the Washington State Department of Health is unique in having a molecular epidemiology team organized among pathogen-specific disease programs in epidemiology; this structure could be replicated in other organizations, she stated.

Interpretation of Clinical Metagenomic Testing

Brown asked about the process for defining a diagnosis upon generating mNGS results. Chiu replied that he views mNGS results as somewhat subjective, similar to a pathology report from a slide or a radiologic report. The utility of the result relies both on generating and reporting the lab result and on providing interpretation, he maintained. To this end, he uses a hybrid approach in which he talks directly with the patient’s clinician (a step made possible by working within the UCSF system) to obtain clinical context. Over time, UCSF clinicians become more experienced with understanding the test results and with integrating them into other clinical lab testing and patient management. Chiu added that the full clinical utility of mNGS testing will not be realized until multiple labs develop the capacity to process these tests. Thus, Chiu is working to make mNGS technologies more widely available.

Role of Ribosomal RNA in Genomics Sequencing

A participant remarked that a large portion of viral metagenomics data remains as ribosomal material, from either the host or bacteria, and that organism identification based heavily on ribosomal RNA is distinct from identification throughout the entire genome. He asked about training bioinformaticians on this distinction and about limitations to implementing methods that offer rapid results. Noting the usefulness of ribosomal RNA, Chiu replied that the University of Washington provides 16S ribosomal RNA testing, which uses primers against conserved regions to amplify bacterial sequences universally and hypervariable regions to identify genus and species. He specified that the test is essentially a pan-bacterial test performed with sequencing, adding that fungi can similarly be identified by targeting fungal 18S or 28S ribosomal RNA regions or the internal transcribed spacer region. However, the high abundance of ribosomal RNA in a sample makes it difficult to detect other targets, such as RNA viruses, using metagenomics. Thus, Chiu stated that he typically removes ribosomal RNA from RNA preps to increase the method’s speed and effectiveness. Chiu remarked that there are commercial products (e.g., FastSelect RNA library kits) that can remove 99 percent of ribosomal RNA in 15 minutes; he routinely uses this resource for transcriptome RNA sequencing libraries or identifying RNA viruses.
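The effect of ribosomal RNA depletion on the remaining material can be illustrated with a short calculation. The Python sketch below is a simplified illustration under stated assumptions: the 99 percent removal figure comes from the discussion above, whereas the 90 percent starting ribosomal fraction, the function names, and the premise that sequencing reads are proportional to RNA mass are assumptions made for the example, not values reported at the workshop.

```python
def non_rrna_fraction_after_depletion(rrna_fraction: float,
                                      depletion_efficiency: float) -> float:
    """Return the share of remaining RNA that is non-ribosomal after depletion.

    rrna_fraction: share of the starting RNA that is ribosomal (0 to 1).
    depletion_efficiency: share of ribosomal RNA removed by the kit (0 to 1).
    """
    if not 0.0 <= rrna_fraction <= 1.0:
        raise ValueError("rrna_fraction must be between 0 and 1")
    if not 0.0 <= depletion_efficiency <= 1.0:
        raise ValueError("depletion_efficiency must be between 0 and 1")
    remaining_rrna = rrna_fraction * (1.0 - depletion_efficiency)
    non_rrna = 1.0 - rrna_fraction
    total = remaining_rrna + non_rrna
    if total == 0.0:
        raise ValueError("no RNA remains after depletion")
    return non_rrna / total


def fold_enrichment(rrna_fraction: float, depletion_efficiency: float) -> float:
    """Return how many times larger the non-ribosomal share becomes."""
    if rrna_fraction >= 1.0:
        raise ValueError("starting material contains no non-ribosomal RNA")
    before = 1.0 - rrna_fraction
    after = non_rrna_fraction_after_depletion(rrna_fraction, depletion_efficiency)
    return after / before


# Assumed starting point: 90 percent of the RNA is ribosomal (hypothetical).
# Removal efficiency: 99 percent, the figure cited in the discussion.
share = non_rrna_fraction_after_depletion(0.90, 0.99)
gain = fold_enrichment(0.90, 0.99)
print(f"non-ribosomal share after depletion: {share:.3f}")
print(f"fold enrichment: {gain:.2f}")
```

Under these assumptions the non-ribosomal share rises from 10 percent to roughly 92 percent of the remaining material, a gain of about ninefold, which is why depletion leaves far more sequencing capacity for transcriptome profiling or RNA virus detection.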

RNA Sequencing Applications

An attendee questioned whether metagenomic assays could use RNA sequencing to determine host factors that inform both diagnosis and prognosis. Additionally, he asked about opportunities to personalize therapeutic approaches by tailoring antiviral and antibacterial treatments according to host factors. Chiu replied that the ability of RNA sequencing to assay the gene expression profile for a given individual at a given point in time may offer utility in longitudinal host expression; therefore, it could potentially yield prognostic data. For instance, some data suggest that RNA sequencing could aid in identifying patients at risk of more severe COVID-19 complications, such as severe pneumonia. He and colleagues are using the transcriptome profiling approach to address chronic illnesses such as long COVID and persistent symptoms associated with Lyme disease to determine whether RNA sequencing could play a role in investigating chronic illnesses, rather than simply diagnosing acute illnesses. Chiu noted potential for the approach to personalize diagnosis and treatment. For example, some patients with acute cryptococcal infections appear to have a host response suggestive of an acute fungal infection, which resolves after the infection is treated. These methods could aid in identifying patients not responding to long courses of conventional therapy.

Host Metadata Considerations

Given that PulseNet intersperses animal isolates with human data and applies the clinical label to both, a participant asked about the ability to label the host species. Carleton replied that CDC is working to address this via collaboration with the GenomeTrakr network and PHA4GE to improve metadata. CDC is continuing to evaluate methods to increase the utility of released metadata, without infringing on privacy laws or protections of personally identifiable information, by working to provide additional context, including host species, said Carleton.

Potential Value of Nonhuman Virus Research

A participant stated that metagenomic studies on arboviruses in arthropods have identified insect-specific viruses that are not pathogenic to humans and asked about the potential to control pathogenic viruses such as dengue, chikungunya, and Zika through interference with virus replication in arthropod vectors. Chiu highlighted the potential value of examining how nonhuman viruses interact with viruses known to cause infections in humans. Researchers have focused on pathogens known to cause disease, said Chiu, but interactions among various viruses from different families are known to occur. He added that the families known to yield pathogenic viruses, such as Flaviviridae, Bunyaviridae, and Togaviridae, are studied more than other families. However, interactions between viral families typically considered insect-specific likely occur and may play a role in the transmission of pathogenic viral families, thereby meriting further study, Chiu maintained.
