Data sharing best practices

Like journals, Dryad has publication standards that all submissions must meet before they are approved for publication. Our team of data curators manually evaluates every submission against these standards, which ensures that all datasets we publish align with data sharing best practices. Most submissions are returned to authors with queries that must be addressed before we can approve them for publication.

You can make sure your dataset is published rapidly and smoothly by avoiding some of the most common reasons we send data back to authors for revision.

Insufficient detail in the README

This is, by far, the most common reason that datasets are returned to authors for revision. The README is the first file that curators open when they begin evaluating a data submission. They review it thoroughly to ensure that it provides enough documentation to contextualize and understand the files in the submission, both for seasoned professionals in the field and for those who are relatively new to the discipline. Details that may seem obvious to you as the data creator can be confusing for future users.

To avoid having to revise your README, follow these tips:

Explain the organization, structure, and content of the submitted files. List all folders, subfolders, and files (groups of files or folders can be explained together if they are repetitive). Nested sheets (e.g., Excel), dataframes (e.g., Prism), or arrays (e.g., MATLAB, H5 files) also need to be described.
List and define all variables, along with associated units or interpretation keys.
For repetitive files and folders, describe the naming convention used.
Include any context required to use the files (e.g., software packages/libraries).
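
If you want a quick self-check before submitting, the first tip can be roughly approximated with a short script. The sketch below is only an illustration, not a Dryad tool: the function name `files_missing_from_readme` and the default README filename are assumptions made for this example. It lists every file in a dataset folder whose name never appears in the README text.

```python
from pathlib import Path


def files_missing_from_readme(dataset_dir, readme_name="README.md"):
    """Return relative paths of files whose names never appear in the README.

    Hypothetical helper for illustration; readme_name is an assumed default.
    """
    root = Path(dataset_dir)
    readme_text = (root / readme_name).read_text(encoding="utf-8")
    missing = []
    # Walk every file in the dataset folder, including subfolders.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name != readme_name:
            # A plain substring match: the file counts as documented
            # if its name is mentioned anywhere in the README.
            if path.name not in readme_text:
                missing.append(path.relative_to(root).as_posix())
    return missing
```

A file that shows up in the result is not necessarily a problem, since repetitive files can be described as a group through their naming convention. Treat the output as a list of places to double-check, and remember that it says nothing about whether variables and units are defined.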

Learn more about describing and organizing your data in our guide to data sharing best practices.

Generic title

Datasets with generic titles will be returned to the author for an updated title. The title of a data submission is the primary way that datasets are discovered on Dryad, so curators ensure that the title is brief, descriptive, and unique enough to distinguish it from other similar datasets. Titles should not include information such as author or journal name, publication year, or manuscript number. They should be written as ordinary text, with words separated by spaces rather than underscores, hyphens, or similar file-naming punctuation. Authors can use the title of an associated manuscript with the prefix “Data from:”.

Here are some examples of generic titles that will hinder discoverability of the dataset and related research:

Raw data
Smith et al 2023 Data
Rabbit genetic data
Graphs for Clinical Data

Here are some great examples of titles that will help discoverability and align with data sharing best practices:

Computational analyses of dynamic visual courtship display reveal diet-dependent and plastic male signaling in Rabidosa rabida wolf spiders
Data from: Effects of understory characteristics on browsing patterns of roe deer in central European mountain forests
Data from: Global taxonomic, functional, and phylogenetic diversity of bees in apple orchards
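
Some of the title rules above are mechanical enough to check automatically. The following is a minimal sketch under stated assumptions: the function `title_warnings`, its warning messages, and the five-word threshold are all invented for this example and are not criteria that Dryad curators apply. It cannot judge whether a title is actually descriptive or unique.

```python
import re


def title_warnings(title):
    """Return heuristic warnings for a dataset title (illustrative only)."""
    warnings = []
    # Ignore the optional "Data from:" prefix when judging the rest.
    core = re.sub(r"^\s*data from:\s*", "", title, flags=re.IGNORECASE)
    if "_" in core or " " not in core.strip():
        warnings.append("looks like a filename; separate words with spaces")
    if re.search(r"\b(19|20)\d{2}\b", core):
        warnings.append("contains a publication year")
    if re.search(r"\bet al\b", core, flags=re.IGNORECASE):
        warnings.append("contains an author citation")
    # Assumed threshold: very short titles are rarely descriptive enough.
    if len(core.split()) < 5:
        warnings.append("may be too short to be descriptive")
    return warnings
```

For instance, `title_warnings("Smith et al 2023 Data")` reports the year and the author citation, while the longer example titles listed above produce no warnings.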

Licensing issues

When you use Dryad to publish data you’ve generated, you release it into the public domain under the terms of a Creative Commons Zero (CC0) waiver. The CC0 waiver allows future users to distribute, remix, adapt, and build upon the material in any medium or format, without restriction.

If your dataset includes data derived from other sources, you must ensure that formal licensing or terms of service conditions do not conflict with CC0.

Any content that is currently copyrighted, even under a Creative Commons license other than CC0, cannot be redistributed on Dryad unless the copyright holder is the one republishing it.

Dryad has partnered with Zenodo to host any software, scripts, code, and/or supplementary files that accompany your data submission. Because non-data files are not always compatible with the CC0 license required by Dryad, you will have the opportunity to choose a separate license for your Zenodo files at the final stage of the submission process. Please note that all files uploaded as “Supplemental Information” will be licensed under CC BY.

For more information, see: Removing barriers to data reuse with CC0 licensing, Why Does Dryad Use CC0, and Some dos and don’ts for CC0.

Empty cells

Tabular data files are often submitted with non-structural empty cells within the dataframe (i.e., not used to divide dataframes or header rows). There is no consensus across all research disciplines regarding the best way to address empty cells (e.g., White et al., 2013), in large part because they can have myriad connotations depending on the variable and dataset.

Dryad’s general preference is that authors fill in empty cells with an appropriate numerical (e.g., 0, 999), text (e.g., n/a, inapplicable), or symbol (e.g., -) value that is explained in the README in order to minimize ambiguity for potential users. However, exceptions can be made if empty cells are important for analysis using a certain script or software, provided that these cells are explained in the README for users. If empty cells are included and no description of them is provided, the dataset will be returned to the author.
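
As an illustration of the preference described above, here is a minimal Python sketch that locates empty cells in CSV text and fills them with a placeholder. The function names and the default placeholder `n/a` are assumptions chosen for this example; whatever value you pick still needs to be explained in the README.

```python
import csv
import io


def find_empty_cells(csv_text):
    """Return (row_number, column_name) pairs for empty cells.

    Row 1 is the first data row after the header.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    empties = []
    for row_number, row in enumerate(data, start=1):
        for column, value in zip(header, row):
            if value.strip() == "":
                empties.append((row_number, column))
    return empties


def fill_empty_cells(csv_text, placeholder="n/a"):
    """Return the CSV text with every empty cell replaced by placeholder."""
    rows = csv.reader(io.StringIO(csv_text))
    filled = [[cell if cell.strip() else placeholder for cell in row]
              for row in rows]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(filled)
    return out.getvalue()
```

Running `find_empty_cells` first is useful because different columns may need different fill values: an empty cell can mean "not measured" in one variable and "not applicable" in another. Note that this sketch only inspects cells that are present, so rows that are shorter than the header are not reported.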

File format

Dryad accepts all file types, but there are instances where curators will send back the dataset for file format issues. These include, but are not limited to:

Proprietary file formats that cannot be opened without access to specialized software. We require that users on any operating system be able to open some version of the files with freely available software.
Data in a format that is not optimized for reuse given the data type. For example, we will send back tabular data stored as a PDF or Microsoft Word document.
Files that produce errors when opening because the file has not been properly converted to the intended format, contains broken links, is password protected, or is corrupted.

Learn more about our preferred file formats.
