
Posts Tagged ‘data archiving’

We are pleased to announce that Elementa is the latest journal to integrate submission of manuscripts with data to Dryad. Elementa’s integration with Dryad means that all authors will be invited to archive the data supporting the conclusions in their article, and the process of depositing data files has been simplified by behind-the-scenes coordination between the journal and the repository. Authors will be invited to submit data to Dryad when their manuscript is accepted, and will have the option to set a one-year embargo on the availability of their data files.

The journal has a strong data policy, requiring “all major datasets associated with an article to be made freely and widely available.” The journal is also a Dryad member, and will be covering the charges for its authors when Dryad begins assessing Data Publishing Charges (DPC) on September 1.

Elementa: Science of the Anthropocene is a new open access scientific journal publishing original research reporting new knowledge of the Earth’s physical, chemical, and biological systems.

The journal is a nonprofit collaborative involving BioOne, Dartmouth, the Georgia Institute of Technology, the University of Colorado, the University of Michigan, and the University of Washington. Elementa comprises six inaugural knowledge domains: Atmospheric Science, Earth and Environmental Science, Ecology, Ocean Science, Sustainable Engineering, and Sustainability Transitions.

The journal is now welcoming article submissions, and the first articles will be published in September.


We are pleased to announce that Ecology Letters is the latest journal to integrate submission of manuscripts with data to Dryad.  In this process, the journal and repository communicate behind the scenes in order to streamline data submission for authors, and also to ensure that the article contains a permanent link to the data.

Ecology Letters is published by The French National Center for Scientific Research (CNRS), a public basic-research organization that defines its mission as producing knowledge and making it available to society. Marcel Holyoak, the journal’s Editor-in-Chief, has been actively involved with Dryad since 2009, serving on the Consortium Board from 2009 to 2011 and currently on the elected Board of Directors.

There are already a number of articles in Ecology Letters with associated data in Dryad, including the most frequently downloaded data file in Dryad, The Global Wood Density Database, which has been downloaded nearly 6000 times:

Zanne AE, Lopez-Gonzalez G, Coomes DA, Ilic J, Jansen S, Lewis SL, Miller RB, Swenson NG, Wiemann MC, Chave J (2009) Data from: Towards a worldwide wood economics spectrum. Dryad Digital Repository. doi:10.5061/dryad.234

Article:

Chave J, Coomes D, Jansen S, Lewis SL, Swenson NG, Zanne AE (2009) Towards a worldwide wood economics spectrum. Ecology Letters 12: 351-366. doi:10.1111/j.1461-0248.2009.01285.x

Dryad is delighted to welcome Ecology Letters to the growing group of journals that have taken this important step to support and facilitate their authors’ data archiving.


We are celebrating the recent publication in Dryad of the first data to accompany a book [1, 2]. Odd Couples: Extraordinary Differences Between the Sexes in the Animal Kingdom, from Princeton University Press, examines the occasionally surprising gender differences in animals, and what it means to be male or female in the animal kingdom. It is intended for both general and scientific readers.

The author, Daphne Fairbairn, a professor of biology at the University of California, Riverside, and Editor-in-Chief of Evolution, a Dryad partner journal, describes the data as:

…a survey of all recorded sexual dimorphisms in all of the animal classes that contain dioecious species (species with separate sexes).  It categorizes the prevalence of dioecy, the types of differences between the sexes (size, shape, color, etc.) and the magnitude of the differences.  I use this survey to construct frequency plots in the book, but there was no room to publish the full survey results.  This is the first time that such a survey has been done and I am hoping that it will prove useful to other biologists who might use the data for hypothesis testing.  I might even get around to this myself!

I think these archived data are one of the most significant contributions of the book to the scientific literature, even though they will not be important for non-specialist readers.

While most data in Dryad accompany journal articles, we are happy to see data archiving catching on with other types of publications such as books, theses, dissertations, and conference proceedings. Please contact us if you are interested in submitting data and have any questions about its suitability for Dryad.

[1] Fairbairn DJ (2013) Data from: Odd couples: extraordinary differences between the sexes in the animal kingdom. Dryad Digital Repository. doi:10.5061/dryad.n48cm

[2] Fairbairn DJ (2013) Odd Couples: Extraordinary Differences Between the Sexes in the Animal Kingdom, Princeton University Press, ISBN:9780691141961.


A study providing new insights into the citation boost from open data has been released in preprint form on PeerJ by Dryad researchers Heather Piwowar and Todd Vision. The researchers looked at thousands of papers reporting new microarray data and thousands of cited instances of data reuse. They found that the citation boost, while more modest than seen in earlier studies (overall, ~9%), was robust to confounding factors, distributed across many archived datasets, continued to grow for at least five years after publication, and was driven to a large extent by actual instances of data reuse. Furthermore, they found that the intensity of dataset reuse has been rising steadily since 2003.

Heather, a post-doc based in Vancouver, may be known to readers of this blog for her earlier work on data sharing, her blog, her role as cofounder of ImpactStory, or her work to promote access to the literature for text mining. Recently Tim Vines, managing editor of Molecular Ecology and a past member of Dryad’s Consortium Board, managed to pull Heather briefly away from her many projects to ask her about her background and latest passions:

TV: Your research focus over the last five years has been on data archiving and science publishing. How did your interest in this field develop?

HP: I wanted to reuse data.  My background is electrical engineering and digital signal processing: I worked for tech companies for 10 years. The most recent was a biotech developing predictive chemotherapy assays. Working there whetted my appetite for doing research, so I went back to school for my PhD to study personalized cancer therapy.

My plan was to use data that had already been collected, because I’d seen first-hand the time and expense that goes into collecting clinical trials data.  Before I began, though, I wanted to know if the stuff in NCBI’s databases was good quality, because highly selective journals like Nature often require data archiving, or was it instead mostly the dregs of research because that was all investigators were willing to part with.  I soon realized that no one knew… and that it was important, and we should find out.  Studying data archiving and reuse became my new PhD topic, and my research passion.

My first paper was rejected from a High Profile journal.  Next I submitted it to PLOS Biology. It was rejected from there too, but they mentioned they were starting this new thing called PLOS ONE.  I read up (it hadn’t published anything yet) and I liked the idea of reviewing only for scientific correctness.

I’ve become more and more of an advocate for all kinds of open science as I’ve run into barriers that prevented me from doing my best research.  The barriers kept surprising me. Really, other fields don’t have a PubMed? Really, there is no way to do text mining across all scientific literature?  Seriously, there is no way to query that citation data by DOI, or export it other than page by page in your webapp, and you won’t sell subscriptions to individuals?  For real, you won’t let me cite a URL?  In this day and age, you don’t value datasets as contributions in tenure decisions?  I’m working for change.

TV: You’ve been involved with a few of the key papers relating data archiving to subsequent citation rate. Could you give us a quick summary of what you’ve found?

HP: Our 2007 PLOS ONE paper was a small analysis related to one specific data type: human cancer gene expression microarray data.  About half of the 85 publications in my sample had made their data publicly available.  The papers with publicly available data received about 70% more citations than similar studies without available data.

I later discovered there had been an earlier study in the field of International Studies — it has the awesome title “Posting your data: will you be scooped or will you be famous?”  There have since been quite a few additional studies of this question, the vast majority finding a citation benefit for data archiving.  Have a look at (and contribute to!) this public Mendeley group initiated by Joss Winn.

There was a significant limitation to these early studies: they didn’t control for several important confounders of citation rate (number of authors, for example). Thanks to Angus Whyte at the Digital Curation Centre (DCC) for conversations on this topic. Todd Vision and I have been working on a larger study of data citation and data reuse to address this and to understand deeper patterns of data reuse. Our conclusions:

After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported.  We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data.  Other factors that may also contribute to the citation boost are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.

TV: Awareness of data archiving and its importance for the progress of science has increased massively over the past five years, but very few organizations have actually introduced mandatory archiving policies. What do you see as the remaining obstacles?

HP: Great question. I don’t know. Someone should do a study!  Several journals have told me it is simply not a high priority for them: it takes time to write and decide on a policy, and they don’t have time.  Perhaps wider awareness of the Joint Data Archiving Policy will help.

Some journals are afraid authors will choose a competitor journal if they impose additional requirements. I’m conducting a study to monitor the attitudes, experiences, and practices of authors in journals that have adopted the JDAP and of similar authors who publish elsewhere. The study will run for 3 years, so although I have more than 2500 responses there is still another whole year of data collection to go. Stay tuned :)

Keep an eye on Journal Research Data Policy Bank (JoRD) to stay current on journal policies for data archiving.

Funders, though.  Why aren’t more funders introducing mandatory public data archiving policies (with appropriate exceptions)?  I don’t know.  They should.  Several are taking steps towards it, but golly it is slow.  Is anyone thinking of the opportunity cost of moving this slowly?  More specific thoughts in my National Science Foundation RFI response with coauthor Todd Vision.

TV: You’re a big advocate of ‘open notebook’ science. How did you first get interested in working in this way?

HP: I was a grad student, hungry for information. I wanted to know if everyone’s science looked like my science. Was it messy in the same ways? What processes did they have that I could learn from? What were they excited about *now*: findings and ideas that wouldn’t hit journal pages for months or years?

This was the same time that Jean-Claude Bradley was starting to talk about open notebook science in his chemistry lab.  I was part of the blogosphere conversations, and had a fun ISMB 2007 going around to all the publisher booths asking about their policies on publishing results that had previously appeared on blogs and wikis (my blog posts from the time; for a current resource see the list of journal responses maintained by F1000 Posters).

TV: It’s clearly a good way to work for people whose work is mainly analysis of data, but how can the open notebook approach be adapted to researchers who work at the bench or in the field?

HP: Jean-Claude Bradley has shown it can work very well in a chemistry lab. I haven’t worked in the field, so I don’t want to presume to know what is possible or easy: I’m guessing that in many cases it wouldn’t be easy. That said, more often than not, where there is a will there is a way!

TV: Given the growing concerns over the validity of the results in scientific papers, do you think that external supervision of scientists (i.e. mandated open notebook science) would ever become a reality?

HP: I’m not sure.  Such a policy may well have disadvantages that outweigh its advantages.  It does sound like a good opportunity to do some research, doesn’t it?  A few grant programs could have a precondition that the awardees be randomized to different reporting requirements, then we monitor and see what happens. Granting agencies ought to be doing A LOT MORE EXPERIMENTING to learn the implications of their policies, followed by quick and open dissemination of the results of the experiments, and refinements in policies to reflect this growing evidence-base.

TV: You’re involved in a lot of initiatives at the moment. Which ones are most exciting for you? 

HP: ImpactStory.  The previous generation of tools for discovering the impact of research are simply not good enough.  We need ways to discover citations to datasets, in citation lists and elsewhere.  Ways to find blog posts written about research papers — and whether those blog posts, in turn, inspire conversation and new thinking.  We need ways to find out which research is being bookmarked, read, and thought about even if that background learning doesn’t lead to citations.  Research impact isn’t the one dimensional winners-and-losers situation we have now with our single-minded reliance on citation counts: it is multi-dimensional — research has an impact flavour, not an impact number.

Metrics data locked behind subscription paywalls might have made sense years ago, when gathering citation data required a team of people typing in citation lists.  That isn’t the world we live in any more: keeping our evaluation and discovery metrics locked behind subscription paywalls is simply neither necessary nor acceptable.  Tools need to be open, provide provenance and context, and support a broad range of research products.

We’re realizing this future through ImpactStory: a nonprofit organization dedicated to telling the story of our research impact. Researchers can build a CV that includes citations and altmetrics for their papers, datasets, software, and slides: embedding altmetrics on a CV is a powerful agent of change for scholars and scholarship. ImpactStory is co-founded by me and Jason Priem, funded by the Alfred P. Sloan Foundation while we become self-sustaining, and is committed to building a future that is good for scholarship. Check it out, and contact us if you want to learn more: team@impactstory.org

Thanks for the great questions, Tim!


We are pleased to announce that Biology Letters is the latest journal to integrate submission of manuscripts with data to Dryad.  In this process, the journal and repository communicate behind the scenes in order to streamline data submission for authors and ensure that the article contains a permanent link to the data.

It is particularly apt because Biology Letters is published by the Royal Society, which invented the idea of sharing knowledge through a scientific journal back in 1665.  Scientific communication has come a long way from those early letters among gentleman natural philosophers to the current conception of Science as an Open Enterprise conducted in the public interest.  Reflecting these changes in science and technology, the Royal Society recently strengthened its policy on the availability of research data:

To allow others to verify and build on the work published in Royal Society journals it is a condition of publication that authors make available the data and research materials supporting the results in the article.

Datasets should be deposited in an appropriate, recognized repository and the associated accession number, link or DOI to the datasets must be included in the methods section of the article. Reference(s) to datasets should also be included in the reference list of the article with DOIs (where available). Where no discipline-specific data repository exists authors should deposit their datasets in a general repository such as Dryad.

There are already a healthy number of articles in Biology Letters with associated data in Dryad, including one of last year’s hit data packages, Monsters are people too.  The first to be published via integrated submission is:

Article:

Jevanandam N, Goh AGR, Corlett RT (2013) Climate warming and the potential extinction of fig wasps, the obligate pollinators of figs. Biology Letters 9(3): 20130041. doi:10.1098/rsbl.2013.0041

Data:

Goh AGR, Corlett RT, Jevanandam N (2013) Data from: Climate warming and the potential extinction of fig wasps, the obligate pollinators of figs. Dryad Digital Repository. doi:10.5061/dryad.hj7h2


Photo by DAVID ILIFF. License: CC-BY-SA 3.0

Mark Your Calendar!

The 2013 Dryad Membership Meeting

St Anne’s College, Oxford, UK

24 May 2013


The Dryad Membership Meeting will cap off a series of separate but related events spotlighting trends in scholarly communication and research data.  Highlights include:

  • A data publishing symposium on May 22 - Featuring new initiatives and current issues in data publishing (open to the public, nominal registration fee may apply).
  • A Joint Dryad-ORCID Symposium on Research Attribution on May 23 - On the changing culture and technology of how credit is assigned and tracked for data, software, and other research outputs (Public).
  • Dryad Membership Meeting on May 24 - Help chart the course for the organization’s future (Dryad Members only).

More details to be announced soon.


Our guest post today is from Mohamed Noor of Duke University, president of the American Genetic Association. The AGA is a scholarly society dating back to 1903.  AGA, together with Oxford University Press, publishes the Journal of Heredity, which is a charter member in the Dryad organization and one of the first journals to integrate manuscript and data submission with the repository.  The society just held their annual symposium in Durham, North Carolina, not so far from Dryad’s NESCent headquarters, and has some excellent news to report from the Council meeting.

The American Genetic Association is pleased to announce that it has now fully adopted the Joint Data Archiving Policy (JDAP) for the Journal of Heredity.  The Journal of Heredity had previously required that newly reported nucleotide or amino acid sequences, and structural coordinates, be submitted to appropriate public databases. For other forms of data, the Journal “endorsed the principles of the Joint Data Archiving Policy (JDAP) in encouraging all authors to archive primary datasets in an appropriate public archive, such as Dryad, TreeBASE, or the Knowledge Network for Biocomplexity.”

This voluntary archiving policy was facilitated by the direct link between the Journal of Heredity and Dryad, in effect since February 2010.

To further support data-sharing and data access, in July 2012, the AGA Council voted unanimously to make data archiving a requirement for publication, under the terms specified in the JDAP.

The requirement will take effect by January 1, 2013. The American Genetic Association also recognizes the vast investment of individual researchers in generating and curating large datasets. Consequently, we recommend that this investment be respected in secondary analyses or meta-analyses in a gracious collaborative spirit.

Many other leading journals in ecology and evolutionary biology have adopted policies modeled on JDAP over the past two years, and other journals are invited to consider it as a policy that has attracted wide support among scientists.


Dryad is delighted to join with PLOS today to announce our partnership with PLOS Biology, as described here on the official PLOS Biology blog, Biologue. As the first Public Library of Science (PLOS) journal to partner with Dryad to integrate manuscript submission, “PLOS Biology can offer authors a seamless tying together of an article with its underlying data; [and] can also provide confidential access for editors and reviewers to data associated with articles under review.”

Here’s how it works: During manuscript evaluation, PLOS Biology invites authors to deposit the underlying data files in Dryad, sending them a link to Dryad which enables a streamlined upload process (no need to enter the article details). Authors may deposit complex and varied data types in multiple formats, and these files are then accessible to editors and reviewers by anonymous and secure access during the manuscript review process. Behind the scenes, the journal’s editorial system and the Dryad repository exchange metadata, so that upon publication the article links to the associated data in Dryad, permanently connecting the published article with its securely archived, publicly available data.
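For readers who think in code, the steps above can be sketched roughly as follows. This is a toy illustration only, not Dryad's or the journal's actual software: the `Repository` and `DataPackage` classes, their method names, the `repository.example` URL, and the placeholder DOI are all hypothetical, made up for this sketch.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataPackage:
    """A data deposit held by the (hypothetical) repository."""
    manuscript_id: str
    title: str
    files: list = field(default_factory=list)
    # Random token that gives editors and reviewers private access.
    review_token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    article_doi: Optional[str] = None
    public: bool = False


class Repository:
    """Toy stand-in for the repository side of the exchange."""

    def __init__(self):
        self._packages = {}

    def receive_manuscript_metadata(self, manuscript_id, title):
        # Step 1: the journal sends the article details, so the author
        # does not have to re-enter them when uploading data.
        package = DataPackage(manuscript_id=manuscript_id, title=title)
        self._packages[manuscript_id] = package
        return {
            "upload_link": f"https://repository.example/submit?ms={manuscript_id}",
            "review_token": package.review_token,
        }

    def deposit(self, manuscript_id, filename):
        # Step 2: the author follows the link and uploads data files.
        self._packages[manuscript_id].files.append(filename)

    def review_access(self, manuscript_id, token):
        # Step 3: during review, files are visible only with the token.
        package = self._packages[manuscript_id]
        if not secrets.compare_digest(token, package.review_token):
            raise PermissionError("invalid review token")
        return list(package.files)

    def publish(self, manuscript_id, article_doi):
        # Step 4: on publication, link the package to the article
        # and make the data publicly available.
        package = self._packages[manuscript_id]
        package.article_doi = article_doi
        package.public = True
        return package


if __name__ == "__main__":
    repo = Repository()
    info = repo.receive_manuscript_metadata("MS-0001", "An example manuscript")
    repo.deposit("MS-0001", "measurements.csv")
    print(repo.review_access("MS-0001", info["review_token"]))
    print(repo.publish("MS-0001", "10.1234/example"))  # placeholder DOI
```

The point of the sketch is the order of events: metadata travels from journal to repository first, reviewers get token-based access in the middle, and the article-to-data link is only fixed at publication.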

Dr. Theodora Bloom, Chief Editor, PLOS Biology, mentions that journals “are uniquely well-placed to help researchers ensure that all data underlying a study are made available alongside any published articles.”

We welcome PLOS Biology authors and editors to Dryad, and look forward to extending this partnership to other PLOS journals.


Christopher Pirrone excavating an odontocete skull (photo by Robert Boessenecker)

Perhaps it’s understandable that paleontologists are committed to preserving the scientific record, since they spend a lot of time and energy finding and extracting shreds of evidence millions of years old.  Now, thanks to a partnership between Dryad and The Paleontological Society announced last year [1], coupled with strong data archiving policies adopted by two of its journals (Paleobiology and the Journal of Paleontology), a rich trove of data will be available for future researchers to unearth from Dryad.

For both journals, authors are being instructed to deposit the underlying data at the time their manuscript is submitted, so that editors and referees will be able to review it prior to acceptance.  Once published on Dryad, the data will be independently discoverable and citable, while at the same time prominently linked both to and from the original article.  Researchers are able to track the reuse impact of their data, independent of the citation impact of their article, by monitoring downloads from Dryad.

Preserved for ages.

Smilodon, by Charles Knight (1905), from a mural at the American Museum of Natural History.

Here’s an example from a recent issue of Paleobiology to sink your teeth into:

Article: Meachen-Samuels JA (2012) Morphological convergence of the prey-killing arsenal of sabertooth predators. Paleobiology 38(1): 1-14. doi:10.1666/10036.1

Data: Meachen-Samuels JA (2012) Data from: Morphological convergence of the prey-killing arsenal of sabertooth predators. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.h58q6

References:

[1]  Callaway E (2011) Fossil data enter the web period. Nature 472, 150. http://dx.doi.org/10.1038/472150a


Scaling up. Courtesy of Swamibu via flickr, CC-BY-NC

The US National Science Foundation, through its Advances in Biological Informatics program, has announced a new award of $2.4M over four years to Duke University (NESCent), the University of North Carolina Chapel Hill (Metadata Research Center), and North Carolina State University (Digital Library).

The award will enable Dryad to scale up its technical infrastructure to support the rapidly expanding user base of journals and researchers, to ensure that the repository is meeting the needs of that user base, and to complete the transition to a financially independent non-profit organization.

This is one of a new breed of Development Awards being made by ABI, in which the review criteria judge the ability of the project to produce “robust, broadly-adopted cyberinfrastructure” with an emphasis on “user engagement, design quality, engineering practices, management plan, and dissemination”.

Repositories such as Dryad enable researchers to comply with funding agency expectations for long-term data preservation and availability, and we are grateful to NSF for its continuing support of this mission.

