Accelerating and Deepening Approaches to FAIR Data Sharing: Proceedings of a Workshop—in Brief (2024)

Suggested Citation: "Accelerating and Deepening Approaches to FAIR Data Sharing: Proceedings of a Workshop—in Brief." National Academies of Sciences, Engineering, and Medicine. 2024. Accelerating and Deepening Approaches to FAIR Data Sharing: Proceedings of a Workshop—in Brief. Washington, DC: The National Academies Press. doi: 10.17226/27786.

On April 20, 2023, the Board on Research Data and Information (BRDI) of the National Academies of Sciences, Engineering, and Medicine (the National Academies) convened a workshop to bring together stakeholders to explore new initiatives to support FAIR (findable-accessible-interoperable-reusable) data sharing,1 as well as the need for innovative approaches, potential obstacles to success, and how obstacles might be overcome. Participants included researchers and representatives from institutions, federal agencies, private funders, and professional societies.2

OPENING REMARKS AND AGENDA SETTING

BRDI chair Sarah Nusser (Iowa State University) welcomed participants and noted the progress made over the past 10 years to promote open science. Memos from the White House Office of Science and Technology Policy (OSTP), along with responses to that guidance from federal agencies, have inspired researchers and institutions to change how they view their research processes. Nonprofits, philanthropies, professional societies, and others have also become involved. She welcomed the focus of the workshop on researchers, noting that many are interested in adopting FAIR principles but are stymied in how to do so due to the limitations on accessibility and reusability of materials and challenges relating to research integrity.

AFTER THE NELSON MEMO: NEW PRIORITIES AND OPPORTUNITIES FOR RESEARCH DATA SHARING

Federal agencies are now crafting policies and practices to respond to the Nelson memo, released by OSTP in August 2022 to promote open science.3 Three cochairs of the National Science and Technology Council’s Subcommittee on Open Science (SOS) spoke about ongoing agency coordination, as well as what their own agencies are doing.

__________________

1 For the FAIR principles that set out requirements for the sharing of scientific data, see https://www.go-fair.org/fair-principles.

2 For the full agenda, speaker biographies, presentations, and other background, see https://www.nationalacademies.org/event/04-20-2023/accelerating-and-deepening-approaches-to-fair-data-sharing-a-workshop.

3 This memo, “Ensuring Free, Immediate, and Equitable Access to Federally Funded Research,” is known as the Nelson memo because it was issued by Dr. Alondra Nelson. For the full text of the memo, see https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf.

4 OSTP. 2022. Ensuring Free, Immediate, and Equitable Access to Federally Funded Research.

Jerry Sheehan (National Library of Medicine/National Institutes of Health [NIH]) reviewed the main provisions of the memo, which states that “publications and their supporting data resulting from federally funded research (are made) publicly accessible without an embargo on their free and public release.”4 NIH is addressing the areas covered by the memo related to data, persistent identifiers (PIDs), and publications and metadata in different ways. A Data Management policy went into effect in January 2023.5 NIH has issued general guidance for comment on a policy related to metadata/PIDs, which has a longer timeline for compliance. Most current work is on access to publications, building on PubMed Central and the NIH Public Access policy. NIH is expanding a preprint pilot, which was initially set up for preprints related to COVID-19. More than 7,700 preprints are available through PubMed Central,6 with 250 to 280 new records added each week, about 10 times the weekly number received in the COVID-centered Phase 1 of the pilot. NIH will determine how often the shared preprints result in publications.

NASA has submitted its updated draft public access plan to OSTP and is awaiting comment, according to Patricia Knezek (NASA). In 2022, NASA’s Science Mission Directorate launched the Open-source Initiative and Transform to Open Science (TOPS) mission and declared 2023 as the Year of Open Science.7 NASA brought the “year” concept to SOS, and many other agencies have joined in with activities and events. NASA, especially the Science Mission Directorate, is looking at open science as a shift in the way science is done through more transparency, inclusiveness, accessibility, and reproducibility.8 She stressed the importance of understanding the impact on scientists so that they can abide by the law with as little burden as possible.

Alan Tomkins (NSF) reported on what NSF has heard about open science through engagements with many communities. There is commonality in principles, with broad agreement about the critical role of publishing, the value of cross-disciplinary work, and the role for new ecosystems to share research. Equity is a driving interest. For example, there is concern that over-reliance on gold open access9 may disadvantage researchers, especially early-career researchers and those from disadvantaged and marginalized communities, who may have limited resources for associated costs. NSF is committed to experimentation to make implementation evidence-based, he said, and it will take time to get that evidence. Since 2021, NSF has spent more than $50 million in this area, including a recent Major Research Instrumentation grant10 to the University of Michigan to develop an open science platform.

DISCUSSION

Nusser pointed to the Nelson memo’s expansive view of data sharing and changes in publication embargos and asked how to help researchers with these aspects. Tomkins said OSTP is reviewing the NSF response but suggested that the author-accepted manuscript (the post-peer review, pre-publication version) may meet the memo’s goal for getting information out. He predicted a change in publication practices within 5 to 10 years, saying that not every publication should be gold open access with associated publication fees. NSF is also working with existing and new data repositories to address current differences in data sharing practices among scientific disciplines.

In response to a question about how the new policies will help the U.S. increase its engagement with international partners, Knezek pointed to the Science Mission Directorate’s Scientific Information Policy (SPD-41a), released in December 2022,11 which lays out expectations for international partners relating to open sharing of scientific information. Sheehan commented that open science has been prominently featured in discussions with G7 and G20 countries and science ministers.

REDESIGNING RESEARCH PROCESSES TO CAPITALIZE ON FAIR

Alexa McCray (Harvard University) is a former BRDI chair and chaired the National Academies consensus study Open Science by Design, which offers a set of principles and practices that foster openness through the research life cycle.12 She moderated a panel to discuss the research processes needed to capitalize on FAIR.

__________________

5 For more information, see https://sharing.nih.gov/data-management-and-sharing-policy.

6 For more information, see https://www.ncbi.nlm.nih.gov/pmc/about/nihpreprints.

7 To read the policy, see https://science.nasa.gov/science-red/s3fs-public/atoms/files/SMD-information-policy-SPD-41a.pdf.

8 For more information, see https://open.science.gov.

9 Gold open access means that the final published version of an article is permanently and freely available, and usually involves payment of an article processing charge by the author.

10 The Research Data Ecosystem (RDE), a National Resource for Reproducible, Robust, and Transparent Social Science Research in the 21st Century, https://www.nsf.gov/awardsearch/showAward?AWD_ID=1946932&HistoricalAwards=false.

11 See https://science.nasa.gov/researchers/open-science/science-information-policy.

12 National Academies of Sciences, Engineering, and Medicine (NASEM). 2018. Open Science by Design: Realizing a Vision for 21st Century Research. Washington, DC: National Academies Press. https://doi.org/10.17226/25116.

From his perspective of five decades of experience in cyberinfrastructure, Daniel Atkins (University of Michigan) recalled a 2003 study he chaired for NSF on cyberinfrastructure and lauded the progress in changing research processes since then with advances in artificial intelligence (AI) and the FAIR guidelines.13 As the benefits of AI-supported science become more apparent and groups using it gain competitive advantage over those who do not, the research community will have a greater incentive to adopt FAIR data principles through engagement in a virtuous cycle, he predicted.

Atkins chaired the recent National Academies study Automated Research Workflows for Accelerated Discovery.14 He called attention to several projects highlighted in the study in which automated workflows advanced science, strengthened rigor and reproducibility, and facilitated collaboration. He noted the report covers social, behavioral, and ethical requirements beyond the technology. Reflecting on the past, he said progress is typically nonlinear; he recounted a case where teams initially resisted global collaboration until they realized that it produced more and better science. He also called for domain scientists and those expert in AI methods to work together and noted investments by Schmidt Futures to support training of the next generation of researchers in the creation and adoption of automated workflows.

Carl Kesselman (University of Southern California) acknowledged recent top-down mandates to promote open science and offered a bottom-up perspective “from the trenches.” Rather than think of FAIR data as the goal at the end of a project, he urged integrating FAIR principles as part of the scientific method done every day, which he and collaborator Ian Foster have termed “continuous FAIRness.” To do this, people need the right training, teams, and tools, he said. He provided two examples: FaceBase,15 supported by the National Institute of Dental and Craniofacial Research, which was initially resisted by researchers but now provides data curation, education, and outreach for craniofacial researchers worldwide, and another study on memory formation in larval zebrafish.16 In the latter study, every piece of data was assigned a digital object identifier at the microscope level. “Don’t make it painful, this is what it means to do good science,” he urged.

DISCUSSION

A participant asked if the curation tools mentioned by Kesselman can be generalized for other disciplines. Kesselman said both his examples used similar platforms and tools, and the key is to be explicit about the data model used and how the data model and the tool can be adapted to different domains. As an analogy, he noted the company SAP has a general set of tools that are customized for a given environment.17 Training must occur at different levels, such as embedding a trainer in a lab or blending teams. He expressed frustration with scientists “who are willing to devote six months to learn how to use a 10X sequencer but not take the time to learn to use a Python notebook.” Atkins noted that a Schmidt Futures program will produce 100 postdocs per year with this knowledge and is also supporting short courses and other generic results.

A participant concurred about the need for FAIR-embedded research but commented that not all instruments can produce metadata as the microscope mentioned by Kesselman does. Kesselman agreed it is hard to obtain standardization across vendors and instruments. “We have to do the best we can, as early as we can,” he said. With some coordination, the science community may have leverage with vendors, he added, noting examples in earth science and in radiology.

STAKEHOLDER PERSPECTIVES: THE CURRENT STATE OF RESEARCH DATA SHARING AND FAIR

Workshop committee member Ramanathan Guha (Google) moderated a session that brought in the perspectives of three stakeholders and discussed Google’s Data Commons as one model for sharing data in usable ways.

__________________

13 Atkins, D. E. et al. 2003. Revolutionizing Science and Engineering through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. https://www.nsf.gov/cise/sci/reports/CH1.pdf.

14 NASEM. 2022. Automated Research Workflows for Accelerated Discovery: Closing the Knowledge Discovery Loop. Washington, DC: National Academies Press. https://doi.org/10.17226/26532.

15 For more information, see https://www.facebase.org.

16 Dempsey et al. 2022. Regional synapse gain and loss accompany memory formation in larval zebrafish. Proceedings of the National Academy of Sciences 119(3):e2107661119. https://doi.org/10.1073/pnas.2107661119.

17 For more information, see https://www.sap.com.

Internet pioneer Vint Cerf (Google) noted his engagement with open science is informed by work with Francine Berman, who led U.S. participation in the Research Data Alliance for NSF. Key parameters to cope with include how to make both formatted and unformatted data persistently useful, he said. Google indexing and large language models (LLMs) are being developed; LLMs are not yet entirely reliable for capturing unformatted data, but he predicted they will be in the future.

More generally, he stressed the need for metadata in order to understand the provenance of datasets and suggested schema.org (a collaborative, community activity for structured data) as a resource.
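To make the metadata suggestion concrete, the following is a minimal sketch of a dataset description that uses the schema.org vocabulary, serialized as JSON-LD. It is an illustration only, not something presented at the workshop: the helper name `build_dataset_record` and every value in the example (title, identifier, creator, and URLs) are hypothetical placeholders. The property names (`name`, `description`, `identifier`, `creator`, `license`, `isBasedOn`) are drawn from the schema.org Dataset type.

```python
import json


def build_dataset_record(name, description, identifier, creator_name,
                         license_url, source_url):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    The helper itself is hypothetical; only the property names come
    from the schema.org vocabulary.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,  # ideally a persistent identifier
        "creator": {"@type": "Person", "name": creator_name},
        "license": license_url,
        "isBasedOn": source_url,   # provenance: what the data derive from
    }


# All values below are placeholders invented for illustration.
record = build_dataset_record(
    name="Example field measurements",
    description="Hypothetical measurements used to illustrate metadata.",
    identifier="https://doi.org/10.0000/example",
    creator_name="A. Researcher",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    source_url="https://example.org/raw-instrument-output",
)

print(json.dumps(record, indent=2))
```

A record of this kind can travel with the data file, so that a later reader, human or machine, can see who produced the dataset and what it was derived from.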

Long-term preservation of data is a concern as formats change. Standards are only good for two generations, he said. He envisioned an archaeologist two thousand years from now digging up some 21st century data: “What if on the front end, there is a description of how to build a machine to read the data?” He echoed Kesselman about the value of tools that generate identifiers automatically and of making data available to others before publication, perhaps initially with access control set by the researcher. He raised the question of how to pay for data collection over the long term and how to build it into the economics of scientific research. He noted indexing is important to facilitate discovery. There are many different ways to do so, such as the Data Commons Model at Google (see below). Figuring out solutions “will benefit ourselves and future generations because the information preserved can be key to amazing discoveries,” he concluded.

David J. Hayes (Stanford Law School) discussed FAIR data sharing in the context of climate change. The collection of macroeconomic data has been the primary approach, but activity-level and verified climate data are needed to guide corporate and governmental policy and funding decisions. The need for an effective FAIR data-sharing mechanism is acute as significant new climate-related data will be generated in the next few years, he stressed.

The Environmental Protection Agency’s National Greenhouse Gas Inventory is updated annually for sectors for which data are required by law. However, in unregulated greenhouse gas (GHG) sectors (e.g., agriculture, forestry and other land-based emissions sources and, until recently, methane sources), the data are poor and have large uncertainty bands. Several of these areas are receiving public and private investments in GHG data collection for a variety of purposes to address data sharing needs in the climate context.

He noted a recent National Academies report on GHG data needs called for a combination of activity-based and atmospheric-based approaches and to align GHG emission information with six pillars that are essential FAIR principles: “Usability and timeliness, information transparency, evaluation and validation, completeness, inclusivity, and communication.”18 He also called attention to a draft federal strategy to advance GHG measurement and monitoring19 that calls for an integrated system of data collection and a coordinated approach. Recent legislation makes significant resources available to fight climate change, but better and more accessible data are needed to support the funding. Stanford University produced a report on data requirements for climate-smart agriculture and will do a similar study focused on forestry.20

__________________

18 NASEM. 2022. Greenhouse Gas Emissions Information for Decision Making: A Framework Going Forward. Washington, DC: The National Academies Press. https://doi.org/10.17226/26641.

19 NASA. 2023. Request for Information: Draft Federal Strategy to Advance an Integrated U.S. Greenhouse Gas Monitoring and Information System; Number NNH23ZDA009L. Federal Register 2023-04328. https://www.federalregister.gov/documents/2023/03/02/2023-04328/request-for-information-draft-federal-strategy-to-advance-an-integrated-us-greenhouse-gas-monitoring. Additional draft paper on agriculture and forestry will follow.

20 Hayes, D. J. et al. 2023. Data Progress Needed for Climate-Smart Agriculture. Stanford Law & Policy Lab. https://law.stanford.edu/publications/data-progress-needed-for-climate-smart-agriculture. The aforementioned report on climate-smart forestry was published in July after the workshop. See https://law.stanford.edu/publications/measuring-the-carbon-and-other-benefits-of-climate-smart-forestry-practices.

21 These can be found at https://www.go-fair.org/fair-principles.

22 For the framework, see https://www.go-fair.org/how-to-go-fair.

Erik Schultes (GO FAIR) discussed developments in Europe related to implementation of the FAIR principles and offered the concept of data visiting to contrast with data sharing. According to Schultes, the 2016 publication of the FAIR principles did not cover implementation, though the sub-principles get us some way. Above all, the article stressed the importance of machine-actionability and automation as a key objective of the FAIR principles.21 Automation (machine-actionable metadata), technical infrastructure, and domain-specific choices are addressed throughout. To help stakeholders, GO FAIR boiled down the principles into a three-step “how to” called the Three-Point FAIRification Framework.22 The framework’s first step is for a community of practice to consider domain-relevant metadata requirements and, with metadata experts perhaps in Metadata for Machine Workshops, develop machine-actionable metadata components. The second step is to develop a FAIR Implementation Profile (FIP), which can be developed in a questionnaire environment. The third step is to develop a FAIR Data Point to scope requirements for repositories and other uses.

Schultes offered data visiting as a way to implement the FAIR principles for data that must be controlled. In contrast with data sharing, data visiting allows data to be reused, in precisely defined circumstances, but not copied or transferred. Offering several examples, he pointed out the analysis occurs locally, and access to the data is controlled by the owner.
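The contrast with data sharing can be sketched in a few lines of code. The example below is a hypothetical illustration of the general idea, not a description of any system Schultes presented: the class name `DataHost`, the allow-list of analyses, and the minimum group size are all assumptions made for the sketch. The analysis runs where the data live, only an aggregate leaves, and the records themselves are never returned.

```python
from statistics import mean

# Hypothetical policy set by the data owner: which analyses may "visit"
# the data, and the smallest group for which a result may be released.
ALLOWED_ANALYSES = {"mean": mean, "count": len}
MIN_GROUP_SIZE = 5


class DataHost:
    """Holds records locally and exposes only approved aggregate results."""

    def __init__(self, records):
        self._records = list(records)  # raw records never leave this object

    def visit(self, analysis_name, field):
        """Run an approved analysis on one field and return the aggregate."""
        if analysis_name not in ALLOWED_ANALYSES:
            raise PermissionError(f"analysis not permitted: {analysis_name}")
        values = [r[field] for r in self._records if field in r]
        if len(values) < MIN_GROUP_SIZE:
            raise PermissionError("too few records to release an aggregate")
        return ALLOWED_ANALYSES[analysis_name](values)


# Invented example data held by the owner.
host = DataHost([{"yield": 3.0}, {"yield": 4.0}, {"yield": 5.0},
                 {"yield": 4.0}, {"yield": 4.0}])

print(host.visit("mean", "yield"))   # an aggregate is returned
print(host.visit("count", "yield"))  # so is a count, but never the rows
```

A request for anything outside the allow-list raises `PermissionError`, which is the sketch's stand-in for "precisely defined circumstances" of reuse.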

Guha described Google’s Data Commons as an example of a way to download, clean, normalize, and join data through natural language.23 Providing several examples (such as which California counties are most at food risk from climate change or which counties across the nation are most at health risk), Guha showed how Data Commons can provide knowledge graphs through an open source, cloud-based infrastructure. Other examples he shared included the Feeding America Data Commons, Sustainability Data Commons, India Data Commons, and Biomedical Data Commons. AI will further advance the potential of these commons, he said.

DISCUSSION

Nusser noted the privacy challenges in creating data commons for agriculture and other areas with a lot of proprietary data. Hayes offered one solution: the Farmers Business Network, launched as a startup and now involving thousands of farmers and millions of acres of farmland in the United States, Canada, and Brazil.24 Anonymized data are aggregated, and participating farmers benefit from the information. He acknowledged anxiety about privacy but said there are proven techniques to anonymize data. Cerf added that the Data Commons presented by Guha uses a system to meld private and public data. He also called attention to Google’s Confidential Computing initiative for data that people recognize as valuable to use in collaborations but do not want to share.25 Schultes referred to data visiting as a complementary approach.

RECENT INITIATIVES AND NEW APPROACHES TO PROMOTING FAIR RESEARCH PRACTICES

While NSF has long been interested in FAIR and open science, Findable Accessible Interoperable Reusable Open Science Research Coordination Networks (FAIROS RCN) was the first NSF solicitation explicitly aimed at advancing these standards and practices, according to program director Martin Halbert (NSF). In 2022, the first year of the solicitation, NSF funded 10 projects with 28 interlinked awards.26 Three recipients described their RCNs.

Matthew Mayernik (National Center for Atmospheric Research [NCAR]) spoke about persistent identifiers in the context of FAIR. NCAR is a Federally Funded Research and Development Center managed as a nonprofit by the University Corporation for Atmospheric Research (UCAR). With more than 120 member colleges and universities, UCAR conducts research and provides facilities and coordination for the community. His talk was informed by NCAR’s new NSF-supported Research Coordination Network (RCN) project focused on PIDs for facilities and instruments.27 NCAR has assigned DOIs (one type of PID) for several years and is using PIDs to expand its tracking of outcomes from different types of scientific facilities. Another motivation for the RCN is to better coordinate PID use across communities.

__________________

23 For more information, see https://www.datacommons.org.

24 See https://www.fbn.com/about.

25 For more information on Confidential Computing, see https://cloud.google.com/confidential-computing.

26 For more information on the solicitation, see https://new.nsf.gov/funding/opportunities/findable-accessible-interoperable-reusable-open.

27 For more information, see https://ncar.github.io/FAIR-Facilities-Instruments.

28 McMurry, J. A. et al. 2017. Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data. PLoS biology 15(6): e2001414. https://doi.org/10.1371/journal.pbio.2001414.

More generally, Mayernik explained PIDs fulfill several FAIR principles, are both persistent and actionable,28 and are key to achieving networked science goals. Many types of PIDs have been developed, such as DOIs for objects and ORCID for individual researcher profiles, and the number of types, for both general and specific purposes, is increasing. He urged keeping researchers front and center, so they, and not just service providers, benefit from using PIDs. Offering several use cases developed by Dan Katz and the Research Data Alliance/FORCE11, Mayernik said these benefits include direct access to artifacts, archiving so artifacts are not lost, references to identify them, descriptions, and credit to authors and contributors.29 Mayernik concluded that PIDs alone do not provide value; the value comes from using them. Inconsistent use of PIDs is a limiting factor and can be actively deceiving, such as through undercounts of data citations or collaboration networks based on incomplete ORCID profiles.30 He also stressed that PIDs require management and maintenance.
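One small, concrete piece of PID maintenance is checking that an identifier is well formed before it is recorded. As an illustration that was not part of the talk, the sketch below validates the final check character of an ORCID iD, which ORCID documents as an ISO 7064 MOD 11-2 checksum over the first 15 digits. The function names are hypothetical, and the sample iD is the one ORCID uses in its own documentation for a fictitious researcher. A passing checksum shows only that the string is well formed, not that the iD is registered or belongs to the right person.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    remainder = total % 11
    result = (12 - remainder) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a matching checksum."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


print(is_valid_orcid("0000-0002-1825-0097"))  # documented sample iD
print(is_valid_orcid("0000-0002-1825-0098"))  # last character altered
```

A check like this catches transcription errors at the point of entry, which is one inexpensive way to reduce the inconsistent PID use described above.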

Christine Kirkpatrick (San Diego Supercomputer Center) introduced the FARR (FAIR in ML, AI Readiness, and Reproducibility) RCN.31 She noted that a motivation behind the three themes in the RCN name was the large amount of time that researchers spend on “data wrangling” rather than analysis. The RCN seeks to answer how FAIR principles can increase the efficiency of people and machines, what role repositories can play in AI readiness, and which community practices can enhance AI reproducibility.

A literature review of AI readiness revealed the term is used in many ways and at different levels of granularity, Kirkpatrick said. She called attention to two frameworks related to AI readiness for business.32 One takeaway, she said, is the importance not just of technical readiness, but also of cultural, regulatory, and other dimensions. Related to AI reproducibility, she pointed to an experiment by a colleague showing that the same data and applications, run on different graphics processing units (GPUs), produced different AI results. While these variances may be acceptable for some purposes, they are not in such critical areas as brain imagery, she pointed out. FARR is increasing awareness about the need to quantify and understand these variances.
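The proceedings do not say what caused the variance in that experiment. One commonly cited contributor to hardware-dependent results, offered here only as an assumption, is that floating-point addition is not associative, so hardware or libraries that accumulate the same numbers in a different order can arrive at slightly different sums. The sketch below demonstrates the effect on a CPU in plain Python; the helper name `sum_in_order` is hypothetical.

```python
def sum_in_order(values):
    """Add the values left to right, as a simple sequential reduction."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, accumulated in two different orders.
forward = sum_in_order([1e16, 1.0, -1e16])    # the 1.0 is lost to rounding
reordered = sum_in_order([1e16, -1e16, 1.0])  # the 1.0 survives

print(forward, reordered)

# The same effect at everyday magnitudes: grouping changes the last bits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)
```

Differences of this size are usually harmless in a single sum, but they can compound across the very large number of operations in model training, which is one reason quantifying such variances matters.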

FARR brings together people who are not already meeting to find community practices around AI and ML that help with efficiency and reproducibility—and encourages research to fill gaps. “Early signals” from the project, she reported, are that AI reproducibility is complex and resource intensive. Documentation must be prioritized and papers older than three years are difficult to reproduce, which suggests shifting priorities for curation teams. The definition of AI readiness is evolving within communities, and a listening tour has shown that people who want to use new technologies need on-ramps or they will be left behind.

A third NSF-supported FAIROS RCN was explained by Ian Foster (Argonne National Lab). MaRCN (Materials FAIROS RCN) builds on ongoing efforts by the Materials Research Alliance to aggregate materials data, which is very diverse and often proprietary.33 MaRCN connects critical stakeholders, convenes to develop metadata standards and shared tooling to enable FAIR materials research, catalyzes through training and engagement, and communicates the benefits of FAIR materials research. An important aspect is to document use cases in which FAIR research led to new or faster discoveries.

As a novel initiative, an LLM Virtual Hackathon in Materials and Chemistry was held in March 2023, involving 13 teams. Highlights were participation by diverse researchers, high visibility for students, and the formation of new teams. The hackathon has logged 1.2 million unique views on social media to date, the equivalent of 30,000 seminars of 40 people each, he pointed out. He noted that social and asynchronous media are huge multipliers and can connect people from around the world. Although the ideas generated in the hackathon must be further developed, the projects showed that LLMs have the promise to affect and extend the definition of FAIR.

DISCUSSION

Halbert explained the FAIROS RCNs hold monthly calls to share observations and practices. As an example of cross-fertilization, Kirkpatrick said she is looking at the NCAR work on PIDs for facilities and instruments. She also underscored the benefit of centering science-of-science research as its own portfolio, rather than tucked into other projects. A participant said the cohort has shown his RCN new methodologies for engaging with communities. Halbert added that several RCNs are focusing on reproducibility.

__________________

29 Research Data Alliance/FORCE 11. 2020. Software Source Code Identification: Use cases and identifier schemes of persistent software source code identification. https://zenodo.org/record/4312464.

30 See Dr. Mayernik’s presentation at: https://www.nationalacademies.org/event/04-20-2023/accelerating-and-deepening-approaches-to-fair-data-sharing-a-workshop.

31 For more information, see Farr-rcn.org.

32 The two AI-ready frameworks referred to by Kirkpatrick include: Holmstrom, J. 2022. From AI to digital transformation: The AI readiness framework. Business Horizons 65(3):329-339; and Najdawi, A. 2020. Assessing AI readiness across organizations: The case of UAE. doi: 10.1109/ICCCNT49239.2020.9225386.

33 For more information, see https://marda-alliance.org.

ORCID as a PID for researchers came up several times during the workshop.34 One challenge is that the U.S. research ecosystem is very decentralized and coordination of research data is hard, but international standards and practices will help in the United States, a participant suggested.

The CARE Principles for Indigenous Data Governance as a complement to FAIR were raised as a way to ensure that research engaged with indigenous communities is conducted appropriately and respectfully. Several participants shared their understanding of the principles and stressed the need for transparency and clear approaches when working with indigenous data and communities.35

CONCLUDING THOUGHTS AND NEXT STEPS

In closing, Nusser reflected on the value of FAIR principles throughout all aspects of the research process for those who produce and those who reuse data. She called for more work to make research FAIR at all points in the process and underscored the need for implementation and training that is meaningful to researchers and is tailored to different domains. Related to reusability, she lauded efforts to make data at least partially FAIR through models to address issues like privacy and proprietary interests. She noted BRDI will draw on the issues raised in the workshop as it plans its work in the coming year.

__________________

34 For more information on ORCID, see https://orcid.org.

35 For the CARE Principles, see https://www.gida-global.org/care.

DISCLAIMER This Proceedings of a Workshop—in Brief was prepared by Paula Whitacre as a factual summary of what occurred at the workshop. The planning committee’s role was limited to planning the workshop. The statements made are those of the rapporteur or individual workshop participants and do not necessarily represent the views of all workshop participants; the planning committee; or the National Academies of Sciences, Engineering, and Medicine.

REVIEWERS To ensure that it meets institutional standards for quality and objectivity, this Proceedings of a Workshop—in Brief was reviewed in draft form by Shreyas Cholia, Lawrence Berkeley National Laboratory, and Simon Hodson, Committee on Data of the International Science Council. The review comments and draft manuscript remain confidential to protect the integrity of the process.

PLANNING COMMITTEE Sarah Nusser (Chair), Iowa State University; Ramanathan Guha, Google; Ian Foster, The University of Chicago; and Christine Kirkpatrick, University of California, San Diego.

STAFF Thomas Arrison, Director, Board on Research Data and Information (BRDI); George Strawn, scholar, BRDI; and Emi Kameyama, Program Officer, BRDI.

SPONSORS This workshop was supported by the National Institutes of Health (HHSN263201800029I/75N98023F00027) and Schmidt Futures.

For additional information regarding the workshop, visit: www.nas.edu/brdi.

SUGGESTED CITATION National Academies of Sciences, Engineering, and Medicine. 2024. Accelerating and Deepening Approaches to FAIR Data Sharing: Proceedings of a Workshop—in Brief. Washington, DC: The National Academies Press. https://doi.org/10.17226/27786.

Policy and Global Affairs

Copyright 2024 by the National Academy of Sciences. All rights reserved.
