Dryad is seeking an energetic and enthusiastic Executive Director, ideally with experience in scientific or biomedical research, librarianship, or publishing, to oversee development and operation of the organisation during a period of rapid growth and transformation. The role reports to the Board of Directors. Externally, the postholder will be responsible for building relationships with stakeholders, customers and users of the Dryad Digital Repository. Internally, key responsibilities include organisational leadership and ensuring Dryad meets its objectives through sound financial management and oversight of day-to-day operations, with the support of a small but growing staff. Review of applications will begin by September 1, 2014 and continue until the position is filled. For details please see the full position description and for inquiries please contact firstname.lastname@example.org.
A number of notable publications have been added to the growing list of those integrating submission of data and manuscripts with Dryad.
The recently adopted Data Policy of Royal Society Publishing now requires that data sets “be deposited in an appropriate, recognized, publicly available repository” and that authors “disclose upon submission of the manuscript any restrictions on the availability of research materials or data.” To support this policy, the Royal Society now sponsors the Data Publication Charge to Dryad for data associated with any of its publications. Proceedings of the Royal Society, Proceedings B, the Royal Society’s flagship biological research journal, has joined Biology Letters in integrating submission with Dryad. Currently, submission of data occurs prior to manuscript review for Proceedings B, and following acceptance for Biology Letters. Watch this space for further efforts to support the data archiving needs of Royal Society Publishing.
In an editorial entitled ‘An open future for ecological and evolutionary data?’, recently published jointly in BMC Ecology and BMC Evolutionary Biology, authors Amye Kenall, Simon Harold and Christopher Foote announce the integration of manuscript submission for these two journals with Dryad in order “to encourage a more widespread adoption of open data sharing in the fields of ecology and evolutionary biology by facilitating this process for our authors.” Now that the technical work has been accomplished for these two journals, submission integration can be easily extended to other BMC series titles at the request of the editors.
Scientific Data is a newly launched open access publication from Nature Publishing Group that aims to promote the accessibility and reuse of scientifically valuable data sets. This is supported by both a strong data deposition policy and a novel publication type called a Data Descriptor.
Data Descriptors will provide detailed descriptions of the experiments and procedures involved in generating important datasets, including essential information needed for scientists to assess the technical quality of the data, reproduce key methods or analysis workflows, and ultimately reuse the data to address important research questions. In addition, every publication at Scientific Data will be supported by metadata describing key properties of the experiments and resulting data, which will be checked by an in-house curator and released in the ISA-tab format, and hopefully other standard formats in the future. These metadata will aid data mining, and will help scientists find and reuse high-quality datasets stored across multiple data repositories.
NPG sponsors data submissions associated with Scientific Data, and data are submitted to Dryad prior to review.
Together with a number of previously integrated journals from German Medical Science and Pensoft Publishers, on subjects ranging from subterranean biology to reconstructive surgery, the total number of titles now integrated with Dryad exceeds 50. Authors may consult this list to see which journals are integrated, when to submit data (either before review or after acceptance), whether the journal allows an optional data embargo, and whether Data Publication Charges are sponsored for that publication. Submission integration is a free service, and can be implemented with a wide variety of manuscript submission systems. We encourage publishers and editors to contact us about integration of additional titles, and we encourage authors to let editors know if this is a feature that they would value.
The following is a guest post from Tom Jefferson of The Cochrane Collaboration, Peter Doshi of the University of Maryland and Carl Heneghan from the University of Oxford. We asked them to tell the story behind their recent Cochrane systematic review and dataset in Dryad, which holds valuable lessons about the evidence base on which major public health recommendations are decided. -TJV
In the late 2000s, half the world was busy buying and stockpiling the neuraminidase inhibitors oseltamivir (Tamiflu, Roche) and zanamivir (Relenza, GSK) in fear of an influenza pandemic.
The advice to stockpile for a pandemic and also use the drugs in non-pandemic, seasonal influenza seasons came from such august bodies as the World Health Organization (WHO), the US Centers for Disease Control and Prevention (CDC) and its European counterpart, the ECDC. However, they were stockpiling on the basis of an unclear rationale, mixing the effect of the antiviral drugs on the complications of influenza (mainly pneumonia and hospitalizations) and their capacity to slow down viral spread giving time for vaccines to be crash produced and deployed.
It has since become clear that none of these parties had seen all the clinical trial evidence for these drugs. They had based their recommendations on reviews of “the literature” which sounds impressive, but in fact refers to short trial reports published in journal articles rather than the underlying detailed raw data. For example, key assumptions of antiviral performance found in the US national pandemic plan trace back to a six page long journal article written by Roche which reported on a pooled-analysis of 10 randomized trials of which only 2 have ever been published.
In contrast, each of the corresponding internal clinical study reports for these 10 trials runs thousands of pages (for background on what clinical study reports are, see here.) Despite the stockpiling, these reports have never been reviewed by CDC, ECDC, or WHO. The WHO and CDC both refused to answer our questions on the evidence base for their policies.
Our Cochrane systematic review of neuraminidase inhibitors, funded by the National Institute for Health Research in the UK, was based on analysis of the full clinical study reports for these drugs, not short journal publications. We obtained these reports from the European Medicines Agency, Roche, and GlaxoSmithKline. It took us nearly four years to obtain the full set of reports. The story of how we got hold of the complete set of clinical trials with no access restrictions is told in our essay “Multisystem failure: the story of anti-influenza drugs”.
With the publication of our review, we are making all 107 full clinical study reports publicly available. If you disagree with our findings, if you want to carry out your own analysis or if you are just curious to see what around 150,000 pages of data look like, they are one click away. Now the discussion about how well these drugs work can happen with all parties able to independently analyze all the trial evidence. This is called open science.
Be aware that there are some minimal redactions carried out by GSK and Roche. They did this to protect investigator and participant identity. While protecting participant identity is understandable, the EMA carries a different view towards protecting investigator identity: “names of experts or designated personnel with legally defined responsibilities and roles with respect to aspects of the Marketing Authorisation dossier (e.g. QP, QPPV, Clinical expert, Investigator) are included in the dossier because they have a legally defined role or responsibility and it is in the public interest to release this data”.
- Jefferson T, Jones MA, Doshi P, Del Mar CB, Hama R, Thompson MJ, Spencer EA, Onakpoya I, Mahtani KR, Nunan D, Howick J, Heneghan CJ (2014) Neuraminidase inhibitors for preventing and treating influenza in healthy adults and children. Cochrane Database of Systematic Reviews, online in advance of print. doi:10.1002/14651858.CD008965.pub4
- Jefferson T, Jones MA, Doshi P, Del Mar CB, Hama R, Thompson MJ, Spencer EA, Onakpoya I, Mahtani KR, Nunan D, Howick J, Heneghan CJ (2014) Data from: Neuraminidase inhibitors for preventing and treating influenza in healthy adults and children. Dryad Digital Repository. doi:10.5061/dryad.77471
We are delighted to announce the availability of the data underlying the book “40 Years of Evolution” by Peter and Rosemary Grant. In this new book, the Grants give an account of their classic, long-term study of Darwin’s finches on one of the Galápagos Islands. From the announcement by Princeton University Press:
The authors used a vast and unparalleled range of ecological, behavioral, and genetic data–including song recordings, DNA analyses, and feeding and breeding behavior–to measure changes in finch populations on the small island of Daphne Major in the Galápagos archipelago. They find that natural selection happens repeatedly, that finches hybridize and exchange genes rarely, and that they compete for scarce food in times of drought, with the remarkable result that the finch populations today differ significantly in average beak size and shape from those of forty years ago. The authors’ most spectacular discovery is the initiation and establishment of a new lineage that now behaves as a new species, differing from others in size, song, and other characteristics. The authors emphasize the immeasurable value of continuous long-term studies of natural populations and of critical opportunities for detecting and understanding rare but significant events.
“40 Years of Evolution”, which is written in a style that will be accessible to researchers, students and a more general audience, includes over 100 line drawings illustrating quantitative patterns among the many variables the authors have studied. There are 82 data files being made available in Dryad for researchers and students to explore the numbers behind those figures. We are proud to be the custodians of this unique scientific resource.
For students and teachers interested in the Grants’ long-term studies of Darwin’s Finches, we also recommend the excellent background material and hands-on data analysis activities from the HHMI BioInteractive site.
Data citation: Grant PR, Grant BR (2013) Data from: 40 years of evolution. Darwin’s finches on Daphne Major Island. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.g6g3h
Updates: The originally scheduled keynote address from Phil Bourne will instead be a session on “The Future of Open Data – What to Expect from US Funders” with Jennie Larkin, Deputy Associate Director for Data Science at NIH and Peter McCartney, Program Director in the Division of Biological Infrastructure at NSF. Also, doors will open at 8:30 for a reception, at which light breakfast will be served.
We’re pleased to announce that our 2014 Community Meeting will be held on May 28 at the Institute for Quantitative Social Science at Harvard University. This year’s meeting is being held jointly with the Dataverse Network Project, and the theme is Working Together on Data Discovery, Access and Reuse.
Many actors play a role in ensuring that research data is available for future knowledge discovery, including individual researchers, their institutions, publishers and funders. This joint community meeting will highlight existing solutions and emerging issues in the discovery, access and reuse of research data in the social and natural sciences.
Keynote speaker Dr. Phil Bourne is the first and newly appointed Associate Director for Data Science at the National Institutes of Health and a pioneer in furthering the free dissemination of science through new models of publishing. Prior to his NIH appointment, he was a Professor and Associate Vice Chancellor at the University of California San Diego. He has over 300 papers and 5 books to his credit. Among his diverse contributions, he was the founding Editor-in-Chief of PLOS Computational Biology, has served as Associate Director of the RCSB Protein Data Bank, has launched four companies, most recently SciVee, and is a Past President of the International Society for Computational Biology. He is an elected fellow of the American Association for the Advancement of Science, the International Society for Computational Biology and the American Medical Informatics Association. Other honors he has received include the Benjamin Franklin Award in 2009 and the Jim Gray eScience Award in 2010.
The meeting will run from 8:30 am – 2:15 pm, including light breakfast and a catered lunch. It will be followed by a Dryad Members Meeting, open to all attendees, from 2:30 – 3:30 pm.
There is no cost for registration, but space is limited. Onsite registration will be made available if space allows, and the proceedings will also be simulcast online. Please see the meeting page for details.
This year’s Community Meeting has been scheduled for the convenience of those attending the Society for Scholarly Publishing Annual Meeting from May 28-30 in Boston. SSP attendees may also wish to attend the session “The continuum from publishers to data repositories: models to support seamless scholarship” May 29th from 10:45am-12:00pm.
For inquiries, please contact Laura Wendell (email@example.com) or Mercè Crosas (firstname.lastname@example.org).
Ecology and Evolution is the latest journal to integrate submission of manuscripts with data to Dryad. Ecology and Evolution is a Wiley open access journal supported by other journals published by Wiley, including journals owned by the British Ecological Society, the European Society for Evolutionary Biology and the Society for the Study of Evolution.
Ecology and Evolution’s integration with Dryad means that all authors will be invited to archive the data supporting the conclusions in the article, and their process of depositing data files has been simplified by the behind-the-scenes coordination between the journal and the repository. Authors are invited to submit data to Dryad when their manuscript is accepted, and have the option to set a one-year embargo on the availability of their data files. There are already 50 articles in the journal with their underlying data archived in Dryad.
The journal has a strong data policy, requiring “as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, the Knowledge Network for Biocomplexity or other suitable long-term and stable public repositories.”
Editor-in-chief Allen J. Moore says “We are fully behind Dryad… and I think things are going well.” Moore is a strong proponent of open data, a former Dryad Board member, and an experienced data depositor.
The journal covers Data Publishing Charges for its authors.
This has been a very interesting and positive collaborative process and has involved a number of groups and committed individuals. Encouraging the practice of data citation, it seems to me, is one of the key steps towards giving research data its proper place in the literature.
As the preamble to the draft principles states:
Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice.
In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data citation.
Please do comment on these principles. We hope that with community feedback and support, a finalised set of principles can be widely endorsed and adopted.
Discussion on a variety of lists is welcome, of course. However, if you want the Synthesis Group to take full account of your views, please be sure to post your comments on the discussion forum.
Some notes and observations on the background to these principles
I would like to add here some notes and observations on the genesis of these principles. As has been widely observed, a number of groups and interested parties have been involved in exploring the principles of data citation for a number of years. Mentioning only some of the sources and events that affected my own thinking on the matter, there was the 2007 Micah Altman and Gary King article, in D-Lib, which offered ‘A Proposed Standard for the Scholarly Citation of Quantitative Data’ and Toby Green’s OECD White Paper ‘We need publishing standards for datasets and data tables’ in 2009. Micah Altman and Mercè Crosas organised a workshop at Harvard in May 2011 on Data Citation Principles. Later the same year, the UK Digital Curation Centre published a guide to citing data.
The CODATA-ICSTI Task Group on Data Citation Standards and Practices (co-chaired by Christine Borgman, Jan Brase and Sarah Callaghan) has been in existence since 2010. In collaboration with the US National CODATA Committee and the Board on Research Data and Information, a major workshop was organised in August 2011, which was reported in ‘For Attribution: Developing Data Attribution and Citation Practices and Standards’.
The CODATA-ICSTI Task Group then started work on a report covering data citation principles, eventually entitled ‘Out of Cite, Out of Mind’ – drafts were circulated for comment in April 2013 and the final report was released in September 2013.
Following the first ‘Beyond the PDF’ meeting in Jan 2011 participants produced the Force11 Manifesto ‘Improving Future Research Communication and e-Scholarship’ which places considerable weight on the availability of research data and the citation of those data in the literature. At ‘Beyond the PDF II’ in Amsterdam, March 2013, a group comprising Mercè Crosas, Todd Carpenter, David Shotton and Christine Borgman produced ‘The Amsterdam Manifesto on Data Citation Principles’. In the very same week, in Gothenburg, an RDA Birds of a Feather group was discussing the more specific problem of how to support, technologically, the reliable and efficient citation of dynamically changing or growing datasets and subsets thereof. And the broader issues of the place of data and research publication were being considered in the ICSU World Data Service Working Group on Data Publication. This group has, in turn, formed the basis for an RDA Interest Group. Oooffff!
How great a thing is collaboration?
From June 2013, as the Force11 Group was preparing its website and activities to take forward the work on the Amsterdam Manifesto, calls came in from a number of sources for these various groups and initiatives to coordinate and collaborate. This was admirably well-received and by July the ‘Data Citation Synthesis Group’ had come into being with an agreed mission statement:
The data citation synthesis group is a cross-team committee leveraging the perspectives from the various existing initiatives working on data citation to produce a consolidated set of data citation principles (based on the Amsterdam Manifesto, the CODATA and other sets of principles provided by others) in order to encourage broad adoption of a consistent policy for data citation across disciplines and venues. The synthesis group will review existing efforts and make a set of recommendations that will be put up for endorsement by the organizations represented by this synthesis group.
The synthesis group will produce a set of principles, illustrated with working examples, and a plan for dissemination and distribution. This group will not be producing detailed specifications for implementation, nor focus on technologies or tools.
As has been noted elsewhere, the group comprised 40 individuals and brought together a large number of organisations and initiatives. What followed over the summer was a set of weekly calls to discuss and align the principles. I must say, I thought these were admirably organised and benefitted considerably from participants’ efforts to prepare documents comparing the various groups’ statements. The face-to-face meeting of the group, in which a lot of detailed discussion to finalise the draft was undertaken, was hosted (with a funding contribution from CODATA) at the US National Academies of Science between the 2nd RDA Plenary and the DataCite Summer Meeting (which CODATA also co-sponsored). It has been intellectually stimulating and a real pleasure to contribute to these discussions and to witness so many informed and engaged people bashing out these issues.
The principles developed by the Synthesis Group are now open for comment and I urge as many people, researchers, editors and publishers as possible who believe that data has a place in scholarly communications to comment on them and, in due course, to endorse them and put them into practice.
Are we finally at the cusp of real change in practice? Will we now start seeing the practice of citing data sources become more and more widespread? It’s too soon to say for sure, but I hope these principles, and the work on which they build, have got us to a stage where we can start really believing the change is well underway.
Simon Hodson is Executive Director of CODATA and a member of the Dryad Board of Directors. This post was originally published on the CODATA blog.