One of the key drivers of Dryad’s success in the years since we first went online in 2008 is our connection to journals and the research publication workflow. While we now support data submission at any point in the research process and showcase data and analysis as independently important contributions to research and discovery, our ability to capture data at a crucial point in the investigator’s path to publication remains very powerful.
Our partnership with journals was first articulated in 2011 by the authors of the Joint Data Archiving Policy (JDAP) – many of whom helped found Dryad – which indicated, “Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future,” and that “as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive…”
This spirit is alive and well now – in 2023 – as demonstrated by a move in March by the American Association for the Advancement of Science (AAAS) to strengthen the scientific record by requiring all data underlying the results in published papers to be publicly and immediately available. Those authors state:
“Ensuring the accuracy of the scientific record is a community process. Researchers, journals, reviewers, image sleuths, and readers all contribute to making sure that the literature is accurate. However, each element of the ecosystem is imperfect. Researchers make inadvertent, and sometimes deliberate, errors. Reviewers don’t always examine every detail. Journals make judgment calls that are not always correct. But collectively, and with the engagement of a wide community that has access to underlying data, a vetted and validated record is possible.”
The AAAS designated Dryad as the exclusive partner to simplify the deposit of data that have no home elsewhere.
Most recently, our home communities have raised the bar again. In a May editorial, the academic editors and publishers of leading journals in ecology and evolution wrote that data-sharing efforts are falling short, and the potential for the reuse of data “in either replicated studies, or in metanalyses that require further analytics, or their use in generating novel results or syntheses is reduced.”
They have called for “a requirement, as a condition for publication, that authors provide all raw data and metadata, code, programming scripts, and bespoke software necessary for fully replicating any analyses that lead to inferences made in a published study.” Their emphasis on metadata and any materials needed to fully replicate the analysis is key here.
Dryad welcomes this progressive new stance and applauds our founding communities. We’re also pleased to report that our curation and publication processes align well with the standards set: we support deposit of both data and code, collection of metadata, versioning of datasets, linking data with related outputs, provision of a permanent digital identifier, unrestricted and immediate access, and other specific recommendations that follow publication best practices and meet requirements of data sharing mandates. We also help peer reviewers assess whether the author has provided the materials needed for “fully replicating any analyses that lead to inferences made in a published study” by making data and software available to them in our “Private for Peer Review” mode.
Here’s how Dryad’s service aligns with each of the features of the data publishing framework the authors describe.
- Detailed metadata with a README file, describing relevant details about data collection, processing, analysis and presentation – Dryad curators thoroughly review the README shared by the author to ensure there is enough documentation to contextualize and understand the files within the submission for both seasoned professionals within the field and those who may be relatively new to the discipline. Spatiotemporal data, taxonomic data, and genetic data may be submitted by the author. We do not yet invoke controlled vocabularies for genetic metadata.
- Organized and clearly labeled data tables and files – Our curators review this with the support of automated tabular-data checks.
- Clearly outlined steps for data processing as described in the associated study – Any technical or methodological information that may help others to understand how the data were generated (e.g., equipment, tools, or reagents used, or procedures followed) can be provided in the Methods section of the Dryad submission form.
- If bespoke scripts, analysis, or modeling methods were used, all associated programming scripts, software, and code required to run any analyses used in the study – Authors have the option to upload associated software, code, and/or supplemental files to Zenodo concurrently with their submission to Dryad. Our curators will check to ensure that any software or code included with the data files is sufficiently explained in the README.
- Clear and consistent file naming, avoiding long names, spaces, and special characters – Curators offer recommendations to help ensure file names and directories are presented in a consistent, logical, and descriptive manner.
- References to other data, where applicable – Authors are encouraged to include links to other data, and other research outcomes, as part of their submission to Dryad.
- Data should be stored in an open and re-usable format – Dryad requires that data files are accessible using open, non-proprietary software wherever possible.
- Clearly stated license under which the data are distributed (e.g., Creative Commons) – We ask authors to confirm their understanding that data published with Dryad will be made available under a Creative Commons CC0 waiver, and we display this license prominently on our data publication pages.
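For authors who want to check file names before submitting, the file-naming point above can be approximated with a short script. This is only an illustrative sketch and not Dryad’s curation tooling: the function name `check_file_name`, the 64-character limit, and the set of allowed characters are all assumptions chosen for the example.

```python
import re
from pathlib import PurePosixPath

# Assumed threshold for a "long" name; the framework does not give a number.
MAX_NAME_LENGTH = 64

# Assumed definition of "special characters": anything other than
# letters, digits, dot, underscore, hyphen, or a space (spaces are
# reported separately below).
SPECIAL_CHARACTERS = re.compile(r"[^A-Za-z0-9._\- ]")


def check_file_name(path: str) -> list[str]:
    """Return a list of naming problems for the final component of `path`."""
    name = PurePosixPath(path).name
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name longer than {MAX_NAME_LENGTH} characters")
    if " " in name:
        problems.append("contains spaces")
    if SPECIAL_CHARACTERS.search(name):
        problems.append("contains special characters")
    return problems


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for candidate in ["site_A_2021.csv", "final data (v2)!.xlsx"]:
        print(candidate, check_file_name(candidate))
```

An empty list means the name passed these particular checks. Curators may still suggest changes so that names are consistent and descriptive across the whole submission.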
All Dryad data are published under CC0 and are freely available. Data publications may be embargoed on rare occasions. Every data publication at Dryad has a unique DOI and a clear data citation. Dryad records are permanent, and updates are clearly indicated and dated.
We’re so pleased to be able to support these pioneering initiatives in promoting the reproducibility of research through open sharing of data and code. We look forward to working with our community of journal editors across disciplines to help develop the Dryad service to meet their needs and our vision to accelerate discovery and translate research into benefits for society worldwide through the open availability and routine reuse of all research data.
Questions about the Dryad curation and publication process are very welcome at hello [at] datadryad [dot] org.
Academic societies and publishing organizations not already members of Dryad may be interested in joining and participating in these discussions. You can learn more here.