Archive for the ‘Data availability’ Category

A number of enhancements to the repository have been made in recent months, including these three that were in high demand from users:

  • First, we have modified our submission process to enable the data to be deposited prior to editorial review of the manuscript. Journals that integrate manuscript and data submission at the review stage can now offer their editors and peer reviewers anonymous access to the data in Dryad while the manuscript is in review. This option is currently being used by several of our partner journals, BMJ Open, Molecular Ecology, and Systematic Biology, and is available to any existing or future integrated journal. Note: authors still begin their data deposit process at the journal.
  • Second, when authors submit data associated with previously published articles, they can pull up the article information using the article DOI or its PubMed ID, greatly simplifying the deposition process for legacy data.
  • Third, Dryad now supports versioning of datafiles. Authors can upload new versions of their files to correct or update the original file. Once authors log in to their Dryad account, the My Submissions option appears under My Account in the left side-menu. Unfinished and completed submissions are listed there; selecting an archived submission allows the author to add a new file. Earlier versions of the file remain available to users, but the metadata may be modified to reflect the reason for the update. The DOIs will be appended with a number (e.g., “.1”, “.2”) so that each version can be uniquely referenced. By default, users will be shown the most current version of each datafile and will be notified of the existence of any previous or subsequent versions.
  • Access and download statistics have been displayed for content in the repository since late 2010; Dryad now displays the statistics for an article’s data together on one page so you can see at a glance how many times the page has been viewed and how many times each component data file has been downloaded. Check out this example from Evolutionary Applications.
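
For readers who handle DOIs in scripts, the version-suffix convention from the third item can be sketched roughly as follows. This is only an illustration, assuming that a version is always a trailing ".N" with N numeric; the function names and the DOI "10.5061/dryad.example" are hypothetical, and a DOI whose own last segment happens to be purely numeric would be misread by this simple rule.

```python
import re

# Assumption: a version is a trailing ".<digits>" on the DOI,
# mirroring the ".1", ".2" examples in the post.
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)\.(?P<version>\d+)$")


def split_versioned_doi(doi):
    """Return (base_doi, version); version is None if no numeric suffix is found."""
    match = _VERSION_SUFFIX.match(doi)
    if match is None:
        return doi, None
    return match.group("base"), int(match.group("version"))


def with_version(base_doi, version):
    """Build the DOI string for a given version number."""
    return f"{base_doi}.{version}"


if __name__ == "__main__":
    # "10.5061/dryad.example" is a made-up identifier used only for illustration.
    print(split_versioned_doi("10.5061/dryad.example.2"))
    print(with_version("10.5061/dryad.example", 1))
```

Keeping the base DOI and the version number separate makes it easy to cite one specific version while still grouping all versions of the same file.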

Christopher Pirrone excavating an odontocete skull (photo by Robert Boessenecker)

Perhaps it’s understandable that paleontologists are committed to preserving the scientific record, since they spend a lot of time and energy finding and extracting shreds of evidence millions of years old.  Now, thanks to a partnership between Dryad and The Paleontological Society announced last year [1], coupled with strong data archiving policies adopted by two of its journals (Paleobiology and the Journal of Paleontology), a rich trove of data will be available for future researchers to unearth from Dryad.

For both journals, authors are being instructed to deposit the underlying data at the time their manuscript is submitted, so that editors and referees will be able to review it prior to acceptance.  Once published on Dryad, the data will be independently discoverable and citable, while at the same time prominently linked both to and from the original article.  Researchers are able to track the reuse impact of their data, independent of the citation impact of their article, by monitoring downloads from Dryad.

Preserved for ages.

Smilodon, by Charles Knight (1905), from a mural at the American Museum of Natural History.

Here’s an example from a recent issue of Paleobiology to sink your teeth into:

Article: Meachen-Samuels JA (2012) Morphological convergence of the prey-killing arsenal of sabertooth predators. Paleobiology 38(1): 1-14. doi:10.1666/10036.1

Data: Meachen-Samuels JA (2012) Data from: Morphological convergence of the prey-killing arsenal of sabertooth predators. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.h58q6

References:

[1]  Callaway E (2011) Fossil data enter the web period. Nature 472, 150. http://dx.doi.org/10.1038/472150a

A recent issue of BMJ highlighted the problem of missing clinical trial data from medical research, exploring both the causes and consequences of unpublished evidence. One of the articles, from Andrew Prayle and colleagues [1], examined compliance with the US Food and Drug Administration’s ostensibly mandatory requirement that clinical trials report their results in ClinicalTrials.gov, as set out in the FDA Amendments Act (FDAAA) of 2007. Alarmingly, they found that only 22% of trials that should have reported results had actually done so. Interestingly, industry-funded trials reported results at a higher frequency than trials with other funders. They conclude:

If the reporting rate does not increase, the laudable FDAAA legislation will not achieve its goal of improving the accessibility of trial results.

Fortunately for those interested in this research, the authors have ensured that their own data are available by depositing them in Dryad, where they have already been downloaded by over 100 users.

For more on the disturbing state of affairs in reporting of clinical trial data, we offer the irrepressible Ben Goldacre speaking at the Strata 2012 conference in February.

[1] Prayle AP, Hurley MN, Smyth AR (2012) Compliance with mandatory reporting of clinical trial results on ClinicalTrials.gov: cross sectional study. BMJ 343: d7373. doi:10.1136/bmj.d7373

Until recently, Mark Hahnel was a PhD student in stem cell biology. Frustrated by seeing how much of his own research output didn’t make it to publications, he endeavored to do something about it by developing a scientific file sharing platform called FigShare. Recently, Mark and FigShare were taken under the wing of Digital Science, a Nature Publishing Group spinoff, and a sleek new FigShare was relaunched in January 2012 with many more features and an ambitious scope.

FigShare allows researchers to publish all of their research outputs in seconds in an easily citable, sharable and discoverable manner. All file formats can be published, including videos and datasets that are often demoted to the supplemental materials section in current publishing models. By opening up the peer review process, researchers can easily publish null results, avoiding the file drawer effect and helping to make scientific research more efficient.

Users do not have to pay for access to the content: public data is made available under the terms of a CC0 waiver and other content under CC-BY.  And FigShare is currently providing unlimited public space and 1GB of private storage space for free.

This is a promising solution for getting negative and otherwise unpublished results out into the world (figures, tables, data, etc.) in a way that is discoverable and citable.  Importantly, much of this content would not be appropriate for Dryad, since it is not associated with (and not documented by) an authoritative publication.

There are clearly some challenges to the FigShare model. A big one, shared with many other Open Science experiments that disseminate prior to peer review, is ensuring that there is adequate documentation for users to assess fitness for reuse. Another challenge, one that Dryad is greatly concerned about, is guaranteeing that the content will still be usable, and that there will be the means to host it, ten or twenty years down the road. These are reflections of larger unanswered questions about how the research community can best take advantage of the web for scholarly communication, and how to optimize filtering, curating or preserving such communications. To answer these questions, the world of open data needs many more innovative projects like FigShare.

FigShare’s relaunch prompts us to highlight a few strengths of the Dryad model:

  • Dryad works with journals to integrate article and data submission, streamlining the deposit process.
  • Dryad curators review files for technical problems before they are released, and ensure that their metadata enables optimal retrieval.
  • Dryad’s scope is focused on data files associated with published articles in the biosciences (plus software scripts and other files important to the article).
  • Dryad can make data securely available during peer review, at the request of the journal.
  • Dryad is community-led, with priorities and policies shaped by the members of the Dryad Consortium, including scientific societies, publishers, and other stakeholder organizations.
  • Dryad can be accessed programmatically through a sitemap or OAI-PMH interface.
  • Dryad content is searchable and replicated through the DataONE network, and it handshakes with other repositories to coordinate data submission.
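
To give a sense of what programmatic access over OAI-PMH looks like, here is a minimal sketch that only builds a harvesting request URL. The verb and parameter names (ListRecords, metadataPrefix, from, until) come from the OAI-PMH protocol itself; the endpoint address is a placeholder and not Dryad's actual address, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the repository's real OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access is performed)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date    # optional lower bound on the record datestamp
    if until_date:
        params["until"] = until_date  # optional upper bound on the record datestamp
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2012-01-01"))
```

A harvester would fetch this URL, parse the XML response, and keep requesting further pages with the resumptionToken the server returns until none is left.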

For more about Dryad, browse the repository or see Why Should I Choose Dryad for My Data?

A file sharing platform and a data repository are different animals, to be sure; both have a place in a lively open data ecosystem. We wish success to the Digital Science team, and look forward both to working together and to challenging each other to better meet the needs of the research community. To see what other options are out there for different disciplines and types of data, DataCite provides an updated list of research data repositories.

Our last post celebrated the 1000th data package in Dryad. This week, with the release of two data packages associated with articles in Ecological Monographs, we celebrate another important milestone, our 100th journal.

We believe this validates one of the premises on which Dryad was founded: that a non-specialist data repository can serve as shared infrastructure for a large and diverse set of journals. As a group, they have little in common, serving authors and readers from many different research communities, nationalities, and types of institutional affiliation, and working with many different kinds of data. Some are owned by societies, some by commercial publishers, some by not-for-profits. Some are Open Access, many are not. Some have specialized disciplinary or taxonomic scope (e.g., journals that publish on birds, herps, insects, mammals, plants, protists, or viruses) while some publish findings from all corners of science (Nature, PNAS, Science).

Interestingly, this set of 100 is roughly five times the number of journals that have integrated manuscript submission with Dryad in order to facilitate authors’ data archiving.  While the integrated journals still account for the majority of new data submissions, we are pleased to continue receiving data volunteered by authors publishing in outlets new to Dryad.

The journals that have integrated their manuscript processing with Dryad to date are mostly, though not exclusively, from the fields of evolutionary biology and ecology:

  • The American Naturalist
  • Biological Journal of the Linnean Society
  • BMJ Open (an important first step in that it is our first integrated biomedical journal)
  • Ecological Monographs
  • Evolution
  • Evolutionary Applications
  • Heredity
  • Journal of Evolutionary Biology
  • Journal of Heredity
  • Molecular Ecology and Molecular Ecology Resources
  • Paleobiology
  • Pensoft Publishers – 8 different journals
  • Systematic Biology

But Dryad’s broadening disciplinary coverage is best illustrated by listing some of the journals with content in the repository that have not, at least not yet, implemented integrated submission:

  • Animal Behaviour
  • Bioinformatics
  • Biotropica
  • Conservation Genetics
  • Environmental Microbiology
  • Evolution and Development
  • Frontiers in Psychology
  • Genome Biology and Evolution
  • Human Genomics
  • Integrative and Comparative Biology
  • Journal of Biogeography
  • Journal of Fish and Wildlife Management
  • The Journal of Parasitology
  • Limnology and Oceanography
  • The Plant Cell
  • PLoS Pathogens
  • Symbiosis
  • Toxicon

And we are particularly pleased by the irony of hosting data from Genesis ;)

If you are an editor, publisher, or just a passionate reader of a journal that currently has content in Dryad (you can find out for yourself here), and you would like to talk about how manuscript submission integration could strengthen the service that Dryad provides to your journal, then please contact us.

Dryad has won high-level support from the UK Parliament. Its Select Committee on Science and Technology has been reporting on the peer review of scientific publications. Among the questions it considered was:  How far should reviewers be expected to go to assess technical soundness? The report discusses the feasibility of reviewing the underlying data behind research, and how those data should be managed.

Section 4 of the report (para 189) concludes:

If reviewers and editors are to assess whether authors of manuscripts are providing sufficient accompanying data, it is essential that they are given confidential access to relevant data associated with the work during the peer-review process. This can be problematical in the case of the large and complex datasets which are becoming increasingly common. The Dryad project is an initiative seeking to address this. If it proves successful, funding should be sought to expand it to other disciplines. Alternatively, we recommend that funders of research and publishers work together to develop similar repositories for other disciplines.

The Science and Technology Committee concludes that, in order to allow others to repeat and build on experiments, researchers should aim for the gold standard of making their data fully disclosed and publicly available:

Access to data is fundamental if researchers are to reproduce, verify and build on results that are reported in the literature. We welcome the Government’s recognition of the importance of openness and transparency. The presumption must be that, unless there is a strong reason otherwise, data should be fully disclosed and made publicly available. In line with this principle, where possible, data associated with all publicly funded research should be made widely and freely available. Funders of research must coordinate with publishers to ensure that researchers disclose their data in a timely manner. The work of researchers who expend time and effort adding value to their data, to make it usable by others, should be acknowledged as a valuable part of their role. Research funders and publishers should explore how researchers could be encouraged to add this value.

H.M.S.O. Science and Technology Committee. Eighth Report: Peer review in scientific publications. Published 28 July 2011. Available at: http://www.publications.parliament.uk/pa/cm201012/cmselect/cmsctech/856/85602.htm

Dryad is pleased to welcome BMJ Open as a new partner journal, reflecting the recently expanded scope of the repository to include all of the basic and applied biosciences, including medicine. BMJ Open is a new online-only, open access journal from the esteemed London-based BMJ Group. It is dedicated to publishing medical research from all disciplines and therapeutic areas, utilizing fully open peer review and immediate online publication.

BMJ Open authors are now being strongly encouraged to deposit the data underlying their articles in Dryad or a more specialized repository, as appropriate.  Authors submitting articles to the journal will benefit from Dryad’s journal submission integration, the process by which data deposit is streamlined for authors through behind-the-scenes communication between the journal and the repository.

An extremely important issue with archiving medical data is, of course, the need to protect patient privacy. To assist its authors, BMJ Open is providing special guidance on data sharing.  Authors must be able to release data to the public domain as with all data in Dryad, and the repository will err on the side of caution by turning back any data that may compromise patient privacy.

To quote from the BMJ Group press release:

Data sharing aims to help scientists and doctors validate and scrutinise researchers’ findings in a bid to prevent fraud and eradicate the kind of selective reporting that has enabled some treatments to acquire regulatory approval, based on incomplete and biased data. In some cases this lack of transparency has prompted the subsequent restriction or withdrawal of certain treatments because of patient safety or effectiveness concerns, which were already evident in the unpublished data.  Data repositories also allow researchers to develop new methods of analysis and use the data to answer questions that the original researchers have not thought of. They also facilitate the acquisition of data for meta analysis (more in-depth comparative reviews).

Commenting on the move, Dr Trish Groves, editor in chief of BMJ Open, said: “Since launch, BMJ Open has championed transparency in medical research through open peer review, open access, and full reporting of studies’ methods and results, all exemplified by last week’s paper on the safety (or not) of medical devices (doi:10.5061/dryad.585t4)…”

This data package in Dryad, which illustrates the tremendous value of medical data for informing medical policy and practice without compromising patient privacy, is available at:

  • Heneghan C, Thompson M, Billingsley M, Cohen D (2011) Data from: Medical-device recalls in the UK and the device-regulation process: retrospective review of safety notices and alerts. Dryad Digital Repository. doi:10.5061/dryad.585t4

Groves goes on to say:

We strongly encourage authors to share their datasets, and now we’re delighted to be making that easier to do, with the help of DryadUK.

Kudos to the Dryad UK project team, based at the British Library, for facilitating this pioneering partnership.

Dryad is happy to announce a new initiative with Pensoft Publishers, the pioneering publisher behind ZooKeys and other rapid-publication open access journals, including BioRisk, Comparative Cytogenetics, International Journal of Myriapodology, Journal of Hymenoptera Research, NeoBiota, PhytoKeys, and Subterranean Biology.  Dryad is working with Pensoft to support publication of data papers in the area of biodiversity, together with the Global Biodiversity Information Facility and the Barcode of Life.  Through this effort, we aim to make the data publishing experience as smooth and rewarding as possible for authors, while at the same time making sure these important data are vetted through peer review and available for reuse in public repositories.  The full press release from Pensoft is below.

Data publishing policies and guidelines for biodiversity data published by Pensoft

Pensoft Publishers announced a data publishing project for biodiversity data in response to the increasing demands from institutions and scientists to open scientific data to anyone who would be interested to use them.

“An opinion survey amongst the authors, readers and editors of the Pensoft journal ZooKeys carried out in April convinced us that the majority of participants (84 %) are willing to publish their data, so that to make them available to anyone to use, share or integrate with other data” said Dr Lyubomir Penev, managing director of Pensoft Publishers. Among the most important incentives to publish data, the scientists mentioned  that  “open data increases transparency and the overall quality of science, the potential for collaborative research as well as an opportunity to increase academic credit in the form of citations. Therefore, providing a service to ensure a permanent publication record for published data is of key importance for the success of the project”, adds Dr Penev.

The core of the project is the concept of the “Data Paper” developed in cooperation with the Global Biodiversity Information Facility (GBIF). Data Papers are peer-reviewed scholarly publications that describe the published datasets and provide an opportunity for data authors to receive academic credit for their efforts. Currently, Pensoft offers the opportunity to publish Data Papers describing biodiversity data, Barcode of Life genome data and biodiversity-related software tools, such as interactive keys and others.

Pensoft reached an agreement for cooperation in data hosting and developing of data publishing workflows with the GBIF, the Dryad Data Repository and the Consortium for Barcode of Life.

“Data publishing becomes increasingly important and already affects the policies of the world’s leading science funding frameworks and organizations. Opening and integrating biodiversity data will be the future basis to increase efficiency of monitoring the processes of global change, conservation of nature and saving life on our planet” concluded Dr Vincent Smith, coordinator of the European Union FP7 project ViBRANT, in the framework of which a part of the work has been carried out.

If you have recently published data in Dryad, chances are it was in the course of publishing an article at a partner journal that steered you our way.

But you may be aware that Dryad accepts data from any peer-reviewed article in biology or biomedicine. That includes journals that are not (at least not yet) partners. In fact, as of the time of writing, Dryad has data associated with articles in 79 journals, approximately four times the number of partners.

Dryad even accepts data from articles that have already been published.  Now, why might you wish to go to the trouble of rummaging through those old files and putting your legacy data online?

Well, we noticed a while back that some individuals were beginning to do this systematically. For example, there was a sudden influx of data packages with Frédéric Delsuc’s name on them. Delsuc, of the French National Centre for Scientific Research (CNRS) and the Université Montpellier, is a member of an international team of collaborators (from France, Norway, Canada, Spain, Japan, Germany, Switzerland, and the United States) that has been using DNA sequence data to reconstruct the evolutionary history of a wide range of vertebrates and vertebrate relatives, from anteaters to sea squirts.

Giant Anteaters (Myrmecophaga tridactyla). The pup clinging to his mother is Cyrano, who was born at the Smithsonian’s National Zoo in 2009. Photo credit: Mehgan Murphy, CC-BY-NC-ND, http://creativecommons.org/licenses/by-nc-nd/2.0/

So far, Delsuc and his team [1] have deposited data from 20 articles in Dryad. The articles are in partner journals such as Molecular Biology and Evolution, Molecular Phylogenetics and Evolution, and Systematic Biology, as well as more general science journals such as Nature, Science, and the Proceedings of the National Academy of Sciences USA.

The articles stretch back to 2002, a time when most new desktop computers were still being outfitted with floppy drives. (Remember those?)

We asked Delsuc what he saw as the advantages of archiving his team’s legacy data:

We [...] decided in our team to try to systematically submit our datasets to Dryad because we really think they are valuable. Dryad offers a very nice way of archiving the data ensuring their durability over time.

For Delsuc and his team, no more rummaging through old storage devices to find the files when they receive an email request. No more worrying about the data when lab or departmental websites move. They just need to point their colleagues to Dryad.

It has been reported that the number one reason scientists cite when asked why they have denied their colleagues’ requests for data in the past is the amount of effort required to dig them up [2]. Delsuc and his team intuitively understood that, and went back to archive their data before memories faded, storage devices failed, and graduate students moved on.

The downside to archiving legacy data in this way is that an article’s readers won’t immediately know about the existence of the Dryad data package, since the data DOI will not be published within the text. So, while archiving legacy data has its advantages, there is no substitute for depositing the data before the article is published, as Dryad does with the new articles appearing in its partner journals.

To give Delsuc the final word:

It would be great if more and more journals in the field decide to include data deposit in their publication policies.

[1] “Equipe Phylogénie et Evolution Moléculaire” (Phylogeny and Molecular Evolution team) of the Institut des Sciences de l’Evolution (Institute of Evolutionary Sciences), part of the CNRS: Centre National de la Recherche Scientifique (French National Centre for Scientific Research) and the Université Montpellier 2 (University of Montpellier 2).

[2] Campbell EG et al. (2002) Data Withholding in Academic Genetics: Evidence From a National Survey. JAMA 287(4):473-480. doi:10.1001/jama.287.4.473

Researchers working in data-intensive science, as well as science editors and publishers thinking about data policies, may want to take note of a new article by Michael Whitlock, “Data archiving in ecology and evolution: best practices”, in the current issue of Trends in Ecology & Evolution.

Whitlock has long been a leader in advocating for data archiving and is the current Chair of the Dryad Consortium Board.  In this article he presents concrete suggestions for the what, how and when of data archiving.

But archiving is only half the equation.  Whitlock attempts to articulate sensible guidelines for data reuse, as well. Under what circumstances should researchers contact the original creators of the data set they are re-using, and when is co-authorship appropriate? How should authors properly acknowledge the original creators of the data?

Journals, editors, and publishers have an important role in promoting both data archiving and responsible data reuse.  One problem that merits broader discussion is how journals can conduct peer review so as to prevent data misuse.  Should researchers be given a chance to review manuscripts that report on new results reusing data that they originally published?  Or is it better to avoid the potential for conflict of interest (e.g. “how dare they not replicate my findings!”) and instead recruit independent experts?

Although the article is especially timely for those working in evolutionary biology and ecology, due to the recent adoption of mandatory data archiving at many of the leading journals in the field, these best practice recommendations are relevant across the sciences.

Whitlock MC (2011) Data archiving in ecology and evolution: best practices. Trends in Ecology & Evolution 26(2): 61-65. doi:10.1016/j.tree.2010.11.006
