Dryad newsletter January 2011

Credit: adamthelibrarian, from Flickr

This is an important month, because a host of our partner journals are implementing new policies on data archiving, and, in the U.S., the National Science Foundation is asking its new grantees to have explicit data management plans.  There are over 1000 data files from over 50 journals now in Dryad, and much of this content has been submitted only within the past year. Clearly, Dryad’s role in supporting the growing data archiving mandates from journals and funders continues to expand.

New Features
In the past few months, several new features have been added to Dryad.  Users can now save an incomplete submission and come back later to complete it.  They can see a listing of their completed and in progress submissions.  Users can download data citations to their favorite bibliography management programs and upload them to their favorite social bookmarking tools.  A new “faceted search” interface allows users to find data more easily, and also displays related content in other repositories, including ecological and environmental science data (from the Knowledge Network for Biocomplexity) and phylogenetic data (from TreeBASE). To provide an early indication of scientific impact, users can see how often data have been viewed and downloaded.

An important new feature is “handshaking”, which is what we call the process whereby authors upload some of their data to Dryad, and the information is conveyed behind-the-scenes to a specialized repository. The aim of handshaking is to reduce the time and effort needed to deposit data when there are different repositories managing different aspects of the data.  Handshaking also enables persistent linkages among data in the different repositories. As a first foray into handshaking, we now offer users the option of initiating a deposit in TreeBASE, the primary repository for published phylogenetic data, whenever a NEXUS file is uploaded to Dryad.  Alternatively, the option is available to deposit in another repository first, and report the identifiers to Dryad to ensure that users can find all the data relevant to a given article.  We will be working in the months ahead to handshake with other specialized repositories required by our partner journals.
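For readers who like to see the idea in code, here is a minimal sketch of the decision handshaking starts from: spotting an uploaded NEXUS file so that a TreeBASE deposit can be offered. It relies only on the fact that NEXUS files begin with the token `#NEXUS`. The `Submission` class and the function names are hypothetical, invented for this illustration, and do not describe Dryad's actual software.

```python
from dataclasses import dataclass, field


def looks_like_nexus(text: str) -> bool:
    """Return True if the file contents start with the '#NEXUS' token."""
    # The NEXUS format opens with '#NEXUS'; the check ignores case and leading whitespace.
    return text.lstrip().upper().startswith("#NEXUS")


@dataclass
class Submission:
    """Hypothetical record of one data submission (not Dryad's real data model)."""
    article_title: str
    files: dict = field(default_factory=dict)         # filename -> file contents
    external_ids: dict = field(default_factory=dict)  # repository name -> identifier


def handshake_candidates(submission: Submission) -> list:
    """List the uploaded files that could be offered for deposit in TreeBASE."""
    return [name for name, text in submission.files.items() if looks_like_nexus(text)]


def record_external_id(submission: Submission, repository: str, identifier: str) -> None:
    """Store an identifier reported by the author for data deposited elsewhere first."""
    submission.external_ids[repository] = identifier
```

In this sketch, `handshake_candidates` covers the first route described above (upload to Dryad, then offer a TreeBASE deposit), while `record_external_id` covers the alternative route, where the author deposits elsewhere first and reports the identifier back.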

See our recent blog post about these features for more details.

Data Deposit in Three Easy Steps: The Movie
Are you looking for a way to show a colleague how straightforward data archiving can be?  We’ve added a short (2-minute) video to the site that walks users through the deposit process in three easy steps.  The video is also available at SciVee.

Journals Implement Joint Data Archiving Policy
Starting this month, a number of Dryad partner journals have implemented a Joint Data Archiving Policy that requires, as a condition of publication, that authors deposit the data underlying their article in a public repository.  The journals implementing this policy include The American Naturalist, Evolution, Evolutionary Applications, Heredity, Journal of Evolutionary Biology, and Molecular Ecology. A recent TREE article by Michael Whitlock suggests how “data generators, data re-users, and journals can maximize the fairness and scientific value of data archiving.”

A growing number of journals now integrate their submission process with Dryad, meaning that the repository and journal exchange information to facilitate the author’s data deposition process and to ensure persistent linkage between articles and data. The current list includes The American Naturalist, The Biological Journal of the Linnean Society, Evolution, Journal of Evolutionary Biology, Journal of Heredity, Molecular Ecology, and Molecular Ecology Resources. And more are on the way (stay tuned).

NSF Data Management Plan Mandate
Starting this month, the U.S. National Science Foundation is requiring grant applicants to provide a data management plan describing how data will be collected, preserved and made available, and these plans will be subject to peer review.  We encourage applicants to leverage Dryad in their data management plans as a solution for the long-term preservation and dissemination of the data associated with their publications.  There are some pointers to resources for data management planning on the Dryad website.

Dryad UK Project
The Joint Information Systems Committee (JISC) in the UK has made an award to Dryad, through Oxford University and the British Library, to expand the scope of the journals involved, including into the areas of infectious disease and epidemiology, and to create a UK mirror of Dryad.  More information is here and at the Dryad UK site.

New Twitter Feed for Data Deposits
Interested in keeping up with new data available in Dryad?  Follow our Twitter feed (@datadryadnew) or subscribe to our RSS feed. We also tweet general news about the repository and the world of data science as @datadryad.

Browse and search the repository at http://datadryad.org/
Follow Dryad on Twitter http://twitter.com/datadryad

This blog post is the first issue of the Dryad newsletter, summarizing recent achievements and milestones of the data repository.  If you’d like to receive future newsletters by email, please sign up for the Dryad Users mailing list.

Best practices for data archiving

Researchers working in data-intensive science, as well as science editors and publishers thinking about data policies, may want to take note of a new article by Michael Whitlock, “Data archiving in ecology and evolution: best practices,” in the current issue of Trends in Ecology & Evolution.

Whitlock has long been a leader in advocating for data archiving and is the current Chair of the Dryad Consortium Board.  In this article he presents concrete suggestions for the what, how and when of data archiving.

But archiving is only half the equation.  Whitlock attempts to articulate sensible guidelines for data reuse, as well. Under what circumstances should researchers contact the original creators of the data set they are re-using, and when is co-authorship appropriate? How should authors properly acknowledge the original creators of the data?

Journals, editors, and publishers have an important role in promoting both data archiving and responsible data reuse.  One problem that merits broader discussion is how journals can conduct peer review so as to prevent data misuse.  Should researchers be given a chance to review manuscripts that report on new results reusing data that they originally published?  Or is it better to avoid the potential for conflict of interest (e.g. “how dare they not replicate my findings!”) and instead recruit independent experts?

Although the article is especially timely for those working in evolutionary biology and ecology, due to the recent adoption of mandatory data archiving at many of the leading journals in the field, these best practice recommendations are relevant across the sciences.

Michael C. Whitlock (2011) Data archiving in ecology and evolution: best practices, Trends in Ecology & Evolution,  26 (2): 61-65.  doi:10.1016/j.tree.2010.11.006.

Video shows how to deposit data in Dryad

Are you curious about what’s involved in depositing data in Dryad?  Looking for a quick way to show colleagues how straightforward data archiving can be?  Dryad’s new 2-minute video demonstrates the data deposit process from start to finish.

How to deposit data in Dryad

The video is embedded on the Dryad website, and also available on SciVee. Feel free to link to it and share it with colleagues.

Journals implement data archiving policy

It’s January 2011 – do you know where your data are?

It would be a good idea to know and be ready to deposit your files in a data repository, because this month marks the implementation of the Joint Data Archiving Policy.  The policy, endorsed by a consortium of prominent journals and societies, states that journals will require

“as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive.”

The policy can be customized by each journal, and enables both embargoes and editorial discretion to make special exceptions. Blanket exemptions apply to sensitive data such as identifiable human records and endangered species localities.

The journals (and corresponding societies) implementing the policy this month are:

  • The American Naturalist (American Society of Naturalists)
  • Evolution (Society for the Study of Evolution)
  • Evolutionary Applications
  • Heredity (The Genetics Society)
  • Journal of Evolutionary Biology (European Society for Evolutionary Biology)
  • Molecular Biology and Evolution (Society for Molecular Biology and Evolution)
  • Molecular Ecology
  • Systematic Biology (Society for Systematic Biology)

A sampling of the revised Instructions to Authors includes:

  • The American Naturalist: “The American Naturalist requires authors to deposit the data associated with accepted papers in a public archive. For gene sequence data and phylogenetic trees, deposition in GenBank or TreeBASE, respectively, is required. There are many possible archives that may suit a particular data set, including the Dryad repository for ecological and evolutionary biology data (http://datadryad.org). All accession numbers for GenBank, TreeBASE, and Dryad must be included in accepted manuscripts before they go to Production. Any impediments to data sharing should be brought to the attention of the editors at the time of submission.”
  • Journal of Evolutionary Biology: “The editors and publisher of this journal expect authors to make the data underlying published articles available. An investigator who feels that reasonable requests have not been met by the authors should correspond with the Editor-in-Chief. Authors must use the appropriate database to deposit detailed information supplementing submitted papers, and quote the accession number in their manuscripts.”
  • Molecular Ecology: “Data Accessibility: To enable readers to locate archived data from Molecular Ecology papers, as of January 2011 we will require that authors include a ‘Data Accessibility’ section after their references. This should list the data base and respective accession numbers for all data from the manuscript that has been made publicly available…. Please note that this section must be complete prior to the submission of the final version of your manuscript. Papers lacking this section will not be sent to Production.”

At Dryad, we have been working for some time now with editors and publishers at these and other partner journals to support the implementation of this policy. If you submit an article to a “JDAP journal,” you will be invited to simultaneously submit your data to Dryad. This may occur either prior to review or, depending on the journal, at the time your article is accepted. Dryad and the journal communicate behind the scenes to make it as easy as possible for you to deposit your data, and also ensure that a permanent, resolvable, and citable data identifier is published in the final article.  That way, in the future, no one need be frightened by the question “do you know where your data are?”

Health research funders sign on to goal of greater research data sharing


Statue of Asclepius, the Greek God of Medicine, from the Museum of Epidaurus Theatre. Image from: Wikimedia Commons, Licensed under: GFDL 1.3.

An international group of major health research funders has made a “joint statement of purpose” announcing, in strong and clear terms, their intent to promote greater sharing of research data.

“As public and charitable funders of this research, we believe that making research data sets available to investigators beyond the original research team in a timely and responsible manner, subject to appropriate safeguards, will generate three key benefits: faster progress in improving health, better value for money, and higher quality science.”

The 17 signatories (so far) include many major governmental funding agencies (e.g. the US National Institutes of Health, the US Centers for Disease Control, the UK Medical Research Council, Australia’s National Health and Medical Research Council, the Canadian Institutes of Health Research, France’s National Institute for Health and Medical Research, and the German Research Foundation), private foundations (e.g. the Wellcome Trust, the Bill & Melinda Gates Foundation and the Hewlett Foundation) and even international organizations such as the World Bank.  The group has invited additional funders to sign on to the statement.

Some of the long-term goals articulated in the document are near and dear to our hearts, in particular:

  • “To the extent possible, datasets underpinning research papers in peer-reviewed journals are archived and made available to other researchers in a clear and transparent manner.”
  • “The human and technical resources and infrastructures needed to support data management, archiving and access are developed and supported for long-term sustainability.”

An accompanying comment in The Lancet by Mark Walport of the Wellcome Trust and Paul Brest of the Hewlett Foundation (Sharing research data to improve public health, DOI:10.1016/S0140-6736(10)62234-9) raises some of the hard, but by now familiar, questions that will drive the approaches taken by the funding organizations: how to balance the rights and responsibilities of data generators and data users; how to safeguard and further the interests of the data subjects themselves; and how to ensure that the benefits of data sharing justify the expense and burden involved.

It will be very interesting to watch how the funding organizations work singly and in concert to overcome decades of cultural familiarity with data hoarding in the health sciences and, as Walport and Brest put it, “mend their ways.”