Connecting the Dots of Research

A scuba diver attaches a box to a coral reef

By Maria Praetzellis, Brian Riley, John Chodacki, Catherine Nancarrow, and Marisa Strong.

Researchers at the University of California create enormous quantities of data and research outputs every day. The challenge is tracking and understanding how this data is connected. How can we build systems that trace the downstream use of openly available research data and assess its impact on knowledge creation? The DMPHub is a new tool developed by the California Digital Library (CDL) that advances data management policies and local requirements for sharing data to facilitate and accelerate the research process.

Open data policies are proliferating worldwide, and researchers are required to submit data management plans (DMPs) with most proposals for grants or funding to support their work. DMPs describe the data that will be generated in the research project and outline plans for sharing and preserving this data with the community. Currently, researchers have neither incentives nor easy methods for updating a DMP after a project is underway, leading to poor data stewardship practices overall and often resulting in research data that may not be understood or reused by others.

DMPs traditionally have been basic text documents that outline plans and practices. This static, narrative format poses challenges for stakeholders across the research ecosystem. Funders, for example, must monitor compliance with data-sharing requirements by manually checking the status of a project’s handling of research data.

The DMPHub addresses this problem. It is a new platform that repositions DMPs from text-based planning documents to dynamic hubs in the research ecosystem. Integrated with other core systems and tools used in the research process, the DMPHub automatically updates key information for all stakeholders. For example, when datasets are deposited into a data repository from RSpace, an electronic lab notebook system, citation information for that deposit is automatically updated for the related DMP.

The DMPHub project originated from an NSF EAGER grant. It aimed to convert DMPs from the current static text documents into a structured format with robust metadata and a persistent identifier assigned to each DMP. Together, the research project metadata and the unique identifier support a DMP that is structured so that the rich information it contains can be shared programmatically between systems. This new structured document allows for automated notifications, verification, and reporting in real time. An example of this sharing would be a grants management system using the real-time DMP to verify compliance with funder requirements for data sharing.
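To make the idea of a structured, machine-readable DMP more concrete, here is a minimal sketch in Python. It is not the DMPHub's actual data model or API: the class names, field names, the `undeposited_datasets` helper, and all identifier values are assumptions made up for illustration. It only shows how a system such as a grants management tool might check a structured plan for datasets that have not yet been deposited.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    """One dataset promised in the plan (hypothetical structure)."""
    title: str
    repository: Optional[str] = None   # where the data was deposited, if anywhere
    identifier: Optional[str] = None   # e.g. a DOI assigned on deposit


@dataclass
class StructuredDMP:
    """A DMP held as structured metadata instead of narrative text."""
    dmp_id: str                        # persistent identifier for the plan itself
    project_title: str
    funder: str
    datasets: List[Dataset] = field(default_factory=list)


def undeposited_datasets(dmp: StructuredDMP) -> List[str]:
    """Return titles of planned datasets with no recorded repository deposit."""
    return [
        d.title
        for d in dmp.datasets
        if not (d.repository and d.identifier)
    ]


# Placeholder values only; none of these identifiers are real.
plan = StructuredDMP(
    dmp_id="https://example.org/dmp/0001",
    project_title="Reef monitoring pilot",
    funder="Example Funder",
    datasets=[
        Dataset("Water temperature logs", "Example Repository", "doi:10.0000/example.1"),
        Dataset("Survey photographs"),
    ],
)

print(undeposited_datasets(plan))
```

Because the plan is data and not prose, the same check can be rerun whenever the record changes, which is what makes automated verification and reporting possible.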

The DMPHub generates unique and persistent identifiers, called DMP IDs, for data management plans. It also stores a variety of related identifiers that become discoverable at different stages of the research project. The DMP ID creates an unbreakable link between the plan and the research project’s contributors, outputs, data repositories, and funding sources. This link allows stakeholders at all phases of the research project to use the DMPHub as a means of visualizing all of these connections.
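The linking role of a DMP ID can be sketched as a small registry that collects related identifiers under one plan. Again, this is an illustrative assumption and not how the DMPHub is implemented: the `DmpLinks` class, its method names, the category labels, and every identifier value below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class DmpLinks:
    """Keeps every identifier connected to one DMP ID, grouped by kind."""

    def __init__(self, dmp_id: str) -> None:
        self.dmp_id = dmp_id
        self._links: Dict[str, List[str]] = defaultdict(list)

    def link(self, kind: str, identifier: str) -> None:
        """Attach an identifier (contributor, output, repository, funder)."""
        # Skip repeats so a deposit reported twice is stored only once.
        if identifier not in self._links[kind]:
            self._links[kind].append(identifier)

    def connections(self) -> Dict[str, List[str]]:
        """Return a plain-dict copy of all links, ready to display or export."""
        return {kind: list(ids) for kind, ids in self._links.items()}


# Placeholder identifiers only; none of these are real.
links = DmpLinks("https://example.org/dmp/0001")
links.link("contributor", "orcid:0000-0000-0000-0000")
links.link("output", "doi:10.0000/example.1")
links.link("output", "doi:10.0000/example.1")   # duplicate report, ignored
links.link("funder", "Example Funder")

for kind, ids in links.connections().items():
    print(kind, ids)
```

Identifiers can be added as they become known over the life of a project, and the grouped result is the kind of structure a visualization of the plan's connections could be built from.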

The CDL will continue to release new features to expand the possibilities of the DMPHub, helping to ensure transparency in the research process and promote good data management practices for UC researchers. Many of these new workflows are currently being pilot tested as part of the NSF-funded FAIR Island Project, a UC-focused collaboration involving the CDL, the University of California Gump South Pacific Research Station, and the University of California Natural Reserves System. Through the FAIR Island Project, the CDL will use and build on the DMPHub to track all research outputs generated from work completed at UC-managed field stations and natural reserves.

At a time when the University of California is looking for ways to mitigate the risk of data loss, the CDL team hopes the project can be leveraged by multiple departments across the entire system—from research offices to IT departments to libraries—to give everyone a better understanding of the amazing work that UC researchers do each day.

Note: the DMPHub won a Silver Award in the 2021 UC Sautter Award Program for Innovation in Information Technology.

Maria Praetzellis is product manager, University of California Curation Center (UC3), California Digital Library, UC Office of the President.

Brian Riley is technical lead, University of California Curation Center (UC3), California Digital Library, UC Office of the President.

John Chodacki is director, University of California Curation Center (UC3), California Digital Library, UC Office of the President.

Catherine Nancarrow is associate director, University of California Curation Center (UC3), California Digital Library, UC Office of the President.

Marisa Strong is application program manager, University of California Curation Center (UC3), California Digital Library, UC Office of the President.
