NIDDK Guidance for Writing a DMS Plan
The National Institutes of Health (NIH) Data Management and Sharing (DMS) policy expects that researchers maximize the appropriate sharing of scientific data. The DMS Plan should describe how the scientific data generated will be managed and shared. NIH has outlined six elements that should be included in each submitted DMS Plan. The Writing a Data Management & Sharing Plan section of sharing.nih.gov describes each of the recommended elements.
The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) provides Institute-specific DMS Guidance, which builds upon the NIH guidance, for the following DMS Plan Elements:
- Data Type
- Related Tools, Software, and/or Code
- Data Preservation, Access, and Associated Timelines
- Access, Distribution, or Reuse Considerations
- Oversight of Data Management and Sharing
NIDDK expects investigators will:
Detail the types of data that will be generated.
- Data should be managed at the individual level (experiment, cell, animal, participant), whenever possible. If the data type is at an aggregate/population level (e.g., z-stack images), a description of the aggregation method/process should be provided. (See the “DMS Plan Element: Access, Distribution, or Reuse Considerations” section for more information about depositing/sharing aggregate data.)
Specify data types that are governed by additional policies, for example:
- Genomic data are subject to the Genomic Data Sharing (GDS) policy and the DMS policy. (See NOT-OD-22-198.)
- Human subjects’ data must be handled in a way that is consistent with protecting research participant privacy. (See NOT-OD-22-213.)
- All other policies governing human subjects research must be adhered to. (Refer to the intramural research guidelines as needed.)
- Indicate any data types that will not be shared and include a justification.
- In the event that another applicable policy has more detailed expectations than that of the DMS policy (e.g., GDS policy, requirements in the Notice of Funding Opportunity [NOFO], other NIH policies), the investigator is required to comply with the more detailed policy.
- Clinical data guidelines (e.g., Privacy Protection, Human Data Sharing, and Health Insurance Portability and Accountability Act [HIPAA], when applicable) must be followed in addition to all DMS policies and human research policies, as appropriate.
- NIDDK expects investigators to list and describe custom code or software used to generate the scientific data, and to share the raw data and results used to validate the software for reproducibility purposes and to maximize data reuse after sharing.
NIDDK strongly encourages the use of open-source software/algorithms/resources for data management and analysis.
- Should investigators choose to use proprietary software/algorithms/resources, NIDDK requires the rationale and justification for their use to be included in the DMS Plan.
NIDDK expects that all open-source code/software/tools used to develop the published or submitted dataset are shared at the time of data submission or publication.
- GitHub or Docker images are resources that may allow easy bundling of datasets with their associated code/tools for processing.
- NIDDK expects that raw data and intermediate data accompanying the scientific data are shared at the time of data submission or publication.
- Any novel analytic pipelines or workflows must be shared, with appropriate documentation, to enable another investigator to recreate the analysis.
Data standards relate to how data are organized, documented, and formatted. Data standards, vocabularies, ontologies, and terminologies for a proposed study are dependent on the data types being collected.
NIDDK expects that investigators will use data standards widely accepted within the community. Additionally, NIDDK strongly encourages the use of data standards that may promote the creation of machine-readable datasets.
- Investigators may choose to utilize existing NIH CDE resources; for example, the NIH CDE Repository.
- Specific NOFOs may require the use of certain data standards. Investigators should review the NOFO closely and ensure the DMS Plan reflects the required standard(s).
- Investigators should describe the data standards that will be applied for each data type to be collected.
More information on how to document data standards and metadata for different data types in a DMS Plan can be found on the DMS Plan Worksheet and DMS Plan Examples. Additional examples of metadata and data standards to aid in addressing this element are available on the Tools and Resources page.
Metadata is information about the data itself that provides context to interpret, analyze, or combine the study data and may also make study data more findable. The specific metadata and associated documentation for a proposed study will vary by scientific area, study design, type of data collected, and characteristics of the dataset. Metadata can include methodology and procedures used to collect the data, data labels, definition of variables, and any other information necessary to reproduce and understand the data.
Investigators are expected to describe the anticipated metadata to be submitted with datasets.
- Datasets should have appropriate metadata to enhance their FAIRness (see Wilkinson et al. for more details).
- Metadata should be sufficient such that investigators reusing the dataset can reproduce the findings.
- NIDDK strongly recommends the use of machine-readable metadata, such as through use of the Center for Expanded Data Annotation and Retrieval (CEDAR) Workbench, when practical.
- The Data and Metadata Standards Examples for DMS Plans tool provides several illustrative examples of data standards and metadata schema relevant to NIDDK research.
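As an illustration of what machine-readable metadata might look like, the following minimal sketch builds a dataset metadata record, checks it for missing fields, and serializes it to JSON. The field names, the `REQUIRED_FIELDS` set, and all values are hypothetical; they are not drawn from CEDAR, any repository schema, or NIDDK requirements. In practice, the schema of the chosen repository should be followed.

```python
import json

# Hypothetical minimal set of fields; real repositories define their own schema.
REQUIRED_FIELDS = {"title", "data_type", "collection_method", "variables"}


def missing_fields(record):
    """Return the required fields that are absent or empty, sorted by name."""
    return sorted(f for f in REQUIRED_FIELDS if not record.get(f))


# Illustrative record only; the study and variables are invented for this sketch.
record = {
    "title": "Example glucose tolerance study",
    "data_type": "tabular",
    "collection_method": "Oral glucose tolerance test, sampled every 30 minutes",
    "variables": [
        {"name": "glucose_mg_dl", "definition": "Plasma glucose", "unit": "mg/dL"},
        {"name": "time_min", "definition": "Minutes since glucose load", "unit": "min"},
    ],
}

problems = missing_fields(record)
# Sorted keys and fixed indentation keep the output stable across runs.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json if not problems else f"Missing fields: {problems}")
```

Storing variable definitions and units alongside the data in a structured format such as JSON lets both people and software interpret the dataset without consulting separate documentation.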
Desired Repository Characteristics and Selection
- Investigators must indicate the repository(ies) where the scientific data generated will be archived.
- NIDDK expects that investigators will select repositories that meet the NIH desirable characteristics of data repositories.
- If an investigator plans to utilize a repository that does not meet such characteristics, a justification must be included.
- Investigators need to consider the type of data they will be submitting when selecting a repository. A short justification of the choice of repository for each data type must be included.
- NIDDK strongly encourages investigators to consider the following factors when selecting a repository:
- NOFO-specified repository(ies).
- Organism, domain, or data type-specific repositories.
- Whether controlled access to data is required (e.g., for protection of human subjects’ privacy).
- Studies may generate various types of data from a given individual (e.g., genomic and metabolomic data from a single participant).
- NIDDK does not recommend submitting the same data to multiple repositories; rather, different data types should be submitted to data type- or domain-specific repositories and linked through appropriate metadata to facilitate findability and reuse.
- Investigators are encouraged to submit data to data type-specific repositories (e.g., database of Genotypes and Phenotypes [dbGaP] and Metabolomics Workbench [MetWB]) when appropriate, taking advantage of data type-specific quality control and metadata schema in each repository.
- The NIH Generalist Repository Ecosystem Initiative (GREI) is intended to supplement domain-specific data repositories. If an investigator chooses to submit all data to a generalist repository rather than one or more data type-specific repositories, a justification must be provided.
- Submission to repositories that hold only metadata or summary data (such as ClinicalTrials.gov, Accelerating Medicines Partnership in Type 2 Diabetes [AMP-T2D], and AMP Common Metabolic Diseases [AMP-CMD]) does not fulfill the desirable characteristics under the DMS policy, nor does it meet the NIDDK expectation for sharing of primary data. In these cases, primary data should be submitted to a repository that complies with DMS policy requirements.
- Requirements for study registration and submission of summary data to ClinicalTrials.gov remain in effect. (See NOT-OD-16-149.)
Repositories and tools to find repositories
- dkNET - lists repositories used by NIDDK-supported researchers
- NIH-supported repositories
- Generalist Repositories
- NIDDK Central Repository
- Additional resources to help in "Selecting a Repository"
Data Preservation and Submission Timelines
Per NIH policy, investigators must state in the DMS Plan when their data will be shared and how long the data will be available. NIDDK expects investigators to provide additional information on the following, as applicable:
- Investigators must comply with any timeline for data sharing or longevity of availability, or instructions specified in the NOFO or the NIH Grants Policy Statement on retention of data. (See the HHS Office of Research Integrity's tutorial on retention of data.)
- Investigators must indicate repositories chosen and the longevity of data preservation to demonstrate that these policies are sufficient.
- Consider scientific value and impact over cost when deciding data availability timeframe.
- Investigators must preserve their data for a time period and in a manner that is consistent with all applicable human subjects research regulations.
- Required registration and reporting for clinical research with ClinicalTrials.gov, as described under NOT-OD-16-149, remain in effect.
- Human subjects research involving personally identifiable information (PII) and protected health information (PHI) under HIPAA requires retention for 6 years, but some states have additional regulations and time periods. When such a policy exists, the investigator will be expected to comply with it.
- Journal policies may include requirements and time frames for data availability. They may range from 3 years to more than 10 years (eLife, Nature, JBC), with 5-6 years being the most common.
- Note: Data retention is separate and distinct from any secondary use preservation timelines, which are covered in the next element.
Investigators must address these topics in the DMS Plan when the research will involve human participants:
Provide an explanation of how patient privacy will be protected.
- Repository selection should be done with the consideration for human data privacy.
- The investigator is expected to include a statement on measures taken in the study and by the repository to address re-identification risk.
- Whether consent allows for re-use of data, and any limitations on the scope of data re-use.
- Use of machine-readable informed consents is encouraged whenever possible.
- A copy of the consent form may be provided, or a statement that the investigator is willing to share the consent document with NIDDK in the future.
- NIDDK encourages investigators to make their data as openly available as possible.
- When controlled access is required, such as to protect the privacy of human participants, investigators should explain the process needed to access the data.
NIDDK strongly encourages the sharing of individual-level data wherever possible.
- When individual-level data will not be shared (i.e., when data are aggregated), a justification must be included in the DMS Plan.
- Investigators are encouraged to utilize Global Unique Identifiers (GUIDs) whenever possible to link participants in multiple studies, taking appropriate steps to minimize re-identification risk.
In the event that another applicable policy has more detailed expectations than that of the DMS policy, the investigator is required to comply with the more detailed policy.
Specific guidance for the Intramural Research Program (IRP) from the Office of Intramural Research (OIR) includes the following:
- IRP human data sharing OIR guidance and policy.
- IRP investigators should meet all extramural requirements as well.
- Genomic Data Sharing (GDS) requirements (NOT-OD-22-198) may apply simultaneously with the new DMS policy.
- Clinical data guidelines and human subjects’ data must be handled appropriately as detailed in NOT-OD-22-213 and other policies governing human subjects research.
- Investigators must describe how compliance with the approved DMS Plan will be monitored and the intended frequency of oversight review. The institutional official or designee (title and role) who will be responsible for oversight of compliance with the accepted DMS Plan must be included.
- Adherence to the plan could impact future funding decisions for that institution; thus, the responsible party may be at the institution level. The institutional official or designee is confirming that practices/procedures are in place to monitor that the investigator has established and is adhering to the DMS Plan.
Any updates to the DMS Plan need to be approved by NIDDK. Therefore, NIDDK expects that investigators will work proactively with the Program Officer assigned to their application through their institutional official (or with the Scientific Director’s office for NIDDK intramural investigators) to obtain review and approval of modifications when any changes or updates to the DMS Plan are needed. (See NOT-OD-23-185 for how to request revisions to an approved DMS Plan.) Examples of changes that may require updates to the DMS Plan include, but are not limited to:
- The type(s) of data generated change(s).
- A different data repository(ies) is(are) chosen for submission.
- Adjustments to the sharing timeline.
- NIDDK requires investigators to include an update on progress made toward fulfillment of the DMS activities during the progress report(s), including the final progress report. More details on adherence to the approved DMS Plan are provided by NIH.
- NIDDK strongly encourages the use of a persistent identifier, such as a Digital Object Identifier (DOI), when data are submitted, in the progress reports, and in all related publications.
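To cite a persistent identifier consistently across progress reports and publications, it can help to normalize DOIs to a single resolvable form. The sketch below is one possible approach, not an NIH or NIDDK requirement: the `doi_to_url` helper and the example DOI are hypothetical, and the pattern is a commonly used heuristic for modern DOIs rather than a formal definition.

```python
import re

# Heuristic pattern for modern DOIs; an assumption for this sketch, not an NIH rule.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Strip an optional 'doi:' or resolver prefix and return an https://doi.org/ URL."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognizable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# The DOI below is made up for illustration and does not point to a real dataset.
print(doi_to_url("doi:10.1234/example-dataset.v1"))
```

Using the same URL form everywhere makes it easier to confirm that a progress report, a repository record, and a publication all refer to the same dataset.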
The information will be updated as additional policy or guidelines are established and as new resources are released.