The tools and techniques being developed under the large umbrella of automated research workflows (ARWs) promise to collapse the centuries-old serial method of research investigation into processes where thousands or even millions of simulations or experiments are iterated rapidly in closed loops, with the analysis of data and even the design of experiments or controlled observations being assisted by machine learning (ML) or optimization techniques. Simultaneously, ARWs provide a way to satisfy pressing demands across fields to increase interoperability, reproducibility, replicability, and trustworthiness by better tracking results, recording data, establishing provenance, and creating more consistent metadata than even the most dedicated researchers can provide themselves. The committee’s exploration of ARWs illustrates that the research enterprise stands at an important inflection point. The scientific revolution of the 17th century ushered in an unprecedented era of human progress, leading directly to discoveries and innovations that have transformed tasks requiring the application of muscle or simple technologies into services performed by ever more effective machines. The research enterprise will need to develop new approaches and tools as it enters an era in which core elements of knowledge discovery itself can be automated and accelerated.
In important ways, this emerging process of innovation and adaptation represents a continuation of the long-standing trend of computational power being harnessed to perform a variety of research tasks. Yet new twists will need to be considered and addressed. Concerns about privacy, ethics, and trust arising in many domains of human activity become even more relevant to the entire research enterprise as we increase use of artificial intelligence (AI)-based technologies.
As illustrated in the use cases examined in Chapter 3, different disciplines of research have very different usage patterns relative to ARWs, both in terms of specific tools and platforms and, more generally, in their propensity to incorporate workflows into their processes in the first place. Costs for equipment, software, staffing, and training may vary by discipline, but the broad need for domain researchers to incorporate new methods and approaches holds across the use cases. In addition, specialized expertise in areas such as software engineering, algorithm development, and data science will be required in a number of fields.
Further, several lines of thought that emerged from the March 2020 workshop are germane not just to the task at hand, but more broadly across the scientific enterprise. These themes include the need to break down academic silos, provide incentives for greater collaboration among researchers, ensure greater interoperability across technologies, foster sharing of a broader range of research outputs, and address issues such as striking an appropriate balance between access to and protection of data.
The committee’s findings and recommendations point to promising areas of focus for the research enterprise in facilitating the effective implementation of ARWs. The use cases and supporting literature described in Chapter 3 support all of the recommendations, with Findings A and B and Recommendation 1 in particular flowing directly from examples drawn from a variety of domains. Finding C and Recommendations 2, 3, and 4 are supported in Chapters 4 and 5, which draw on presentations from the March 2020 workshop and other cited literature. Finding D and Recommendation 5 are also supported mainly in Chapter 5, again with points drawn from the use cases.
In many disciplines, the emergence of automated research workflows (ARWs), built upon contemporary cyberinfrastructure, is demonstrating the potential to vastly increase the speed and efficiency of a range of research activities. These include designing and conducting experiments, analyzing data, and observing natural phenomena. These improvements can be realized at scale by implementing infrastructure and practices that facilitate the application of artificial intelligence and machine learning and related technologies to research. Realizing the potential of ARWs could accelerate the pace of scientific discovery by orders of magnitude and thereby expand the research enterprise’s contribution to society.
In addition to increasing the speed and efficiency of research, the effective development and implementation of the technical and human infrastructure for automated research workflows (ARWs) will contribute to strengthening the research process in other ways. For example, the greater transparency and repeatability made possible by automating and capturing specific steps in the research process (advances that underlie the development of ARWs) can foster reproducibility, replicability, and responsibility in research. Adoption of common and interoperable tools and platforms, which could be accelerated by the advance of ARWs but depends on other developments as well, can facilitate international and interdisciplinary research collaboration. Broader access to research workflows and results and the enhanced ability to uncover and correct errors can contribute to greater confidence in research findings and the research enterprise and reduce redundancy among research efforts. To be sure, issues such as dealing with large amounts of streaming data and complex computational approaches will continue to pose technical challenges to the design and implementation of ARWs. In addition, incorporating emerging principles and guidelines for responsible artificial intelligence and machine learning advocated by various organizations, such as building in human review of algorithms, uncovering and addressing bias, and supporting transparency and reproducibility, will help to secure the benefits of ARWs.
Organizations that fund, perform, and disseminate research, along with scientific societies, should support and enable automated research workflows (ARWs) that embody the following design principles:
Realizing the potential of automated research workflows (ARWs) will require changes across the research enterprise, including in how the necessary hardware, software, and human resources are sustainably funded, how the scientific workforce is educated, how research results are reported and shared, and how researcher rewards and incentives are structured. Multidisciplinary, multirole collaboration is essential to realizing that potential.
Research funders, working with other stakeholders such as societies, research institutions, and publishers, should place greater priority on approaches to ensuring the creation and sustainability of key systems, tools, platforms, and data archives for automated research workflows (ARWs). Priorities include
Research funders, higher education, research institutions, and scientific and professional societies should support the development and implementation of educational programs and career pathways aimed at building the workforce needed to develop and utilize automated research workflows (ARWs), including the creation of career tracks that support ARW capabilities. Examples of what is needed include
Research funders, research institutions, and disciplines should work to create an automated research workflow (ARW)-friendly culture by making changes in incentive and reward structures aimed at encouraging behaviors that are central to realizing the potential of ARWs. These include
In addition to barriers to progress that exist within the research process itself, there are legal and policy issues that affect implementation of automated research workflows in specific domains that will require international multistakeholder efforts to address.
Research enterprise funders, performers, publishers, and beneficiaries should work with governments, data privacy experts, and other entities to address the legal, policy, and associated technical barriers to implementing automated research workflows in use-inspired applications in specific domains and explore solutions to make the outputs available through privacy-preserving algorithms, federated learning approaches to using data, and other methods.