
Archive for the ‘Data availability’ Category

We’ve created a new Twitter feed for announcing all new data packages added to Dryad. It’s @datadryadnew. Follow it if you want to keep an eye on what is going into the repository.

Our @datadryad feed is also available, for updates on the Dryad repository and data sharing in general.

Read Full Post »

BioMed Central, publisher of over 200 peer-reviewed journals, has issued a draft statement on data sharing and open data, inviting comments from the scientific community.  BMC’s Iain Hrynaszkiewicz consulted with several Dryad team members in the formulation of the statement.  A related editorial in BMC Research Notes names Dryad as an example of a repository where data are assigned a unique identifier and “available in perpetuity with permanence guaranteed.”  BMC Research Notes is seeking to encourage greater data sharing by waiving the publication fee for all articles which use or link to open data that is prepared in line with a community-accepted standard.

The draft statement supports data deposition in repositories that assign permanent identifiers to data, such as the DOIs used by Dryad. BMC endorses the publishers’ role of providing “clear and permanent links to data hosted in repositories” and is working on a list of the available repositories.

Furthermore, the statement says that “a way forward would be to require that from a specific date, any author submitting to a BioMed Central journal agrees to dedicate the data elements of their article and supplementary material to the public domain and apply the CC0 licence.” This proposed policy aligns closely with the Joint Data Archiving Policy (JDAP) already adopted by several Dryad partner journals.

Comments on the statement can be directed to the BMC blog.

Read Full Post »

Several journals in the field of proteomics have decided to mandate data sharing at the time of publication. These journals are leading the way toward data sharing out of a conviction that “the provenance of data sets and their proper citation is central to the research process,” as described in a recent commentary in Bio-IT World, Share the Data: Making Large-Scale Proteomics Data Widely Available.

Mass Spectrometer, photo from U.S. Department of Energy Genome Programs

Now “authors who publish a manuscript containing mass spectrometry data in Molecular and Cellular Proteomics (MCP) must submit the raw data to a publicly accessible site.”   The journal Proteomics also requires data deposit in a public archive.

There are several specialized data repositories in the field, and several are working together as ProteomeXchange “to provide a single point of submission to proteomics repositories, and encourage the data exchange and sharing of identifiers between the repositories so that the community may easily find datasets in the participating repositories.”

Read Full Post »

There are lots of opinions and answers to this question.  For starters, here’s a lively blog post, responding to this editorial last April.  Consider also this blog post.

What do you think are the barriers to data sharing?

Data from: Thompson S, Daniels K. 2010. A porous convection model for small-scale grass patterns. American Naturalist 175: E10-E15. Dryad Digital Repository. http://hdl.handle.net/10255/dryad.857

Read Full Post »

A new commentary piece, Linking big: the continuing promise of evolutionary synthesis, in the journal Evolution describes the promise of “synthetic science,” which includes re-use of data sets, research results, or unconnected methods or concepts, leading to new discoveries or trends. The authors, all of whom are affiliated with the National Evolutionary Synthesis Center (NESCent), argue for removing the cultural and technological barriers to enable new breakthroughs.

“By putting together pieces of prior research, it is possible to transform how you do science and open the doors to findings that previously were unattainable,” said Brian Sidlauskas, a fish biologist from Oregon State University and lead author on the Evolution article. “But such an approach runs counter to the way science traditionally has been conducted, so pursuing synthetic science is somewhat risky.”

“We need to reduce the risk, remove the barriers, and encourage more pursuit of synthesis because the potential,” he added, “is staggering.”

Sidlauskas cites access to actionable data as one of the major obstacles. “When you’re looking to synthesize data from several hundred individual studies, data formatting, storage and accessibility become huge issues,” he said.   He says that  “…the vast majority of data supporting previous studies are unavailable, often because the data are lost or preserved in inaccessible forms (notebooks, floppy disks).”

The article refers to Dryad as

… working to alleviate the problem of data availability by providing an open-access home for ecological and evolutionary data that does not fit into more specialized repositories. Dryad actively works with a coalition of journals and scientific societies to make deposition of all data a normal part of the research workflow. As more journals require data deposition as part of the manuscript publication process, the opportunities for potential syntheses linking such data will increase substantially.

“It’s kind of an open-source approach to science,” Sidlauskas added. “Data archives may require some kind of proprietary protection for a few months or years, but after a certain amount of time, they should become public domain. Only by saving the data that underlie today’s science will we allow future scientists to use those data in ways that may far exceed what the original researchers envisioned.”

Other authors on the commentary piece include Ganeshkumar Ganapathy, of the National Evolutionary Synthesis Center (NESCent); Einat Hazkani-Covo, Duke University Medical Center; Kristin P. Jenkins, NESCent; Hilmar Lapp, NESCent; Lauren W. McCall, NESCent; Samantha Price, University of California-Davis; Ryan Scherle, NESCent; Paula A. Spaeth, Northland College; and David M. Kidd, NERC Centre for Population Biology, Imperial College London.

CITATION: Sidlauskas, B., G. Ganapathy, et al. (2010). “Linking big: The continuing promise of evolutionary synthesis.” Evolution doi: 10.1111/j.1558-5646.2009.00892.x.

Read Full Post »

A strong editorial on data archiving is now available online in the February issue of The American Naturalist.

Authors Michael C. Whitlock, Mark A. McPeek, Mark D. Rausher, Loren Rieseberg, and Allen J. Moore present the case for the importance of data archiving in science.   This is the first of several coordinated editorials soon to appear in major journals:

To promote the preservation and fuller use of data, The American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology will soon introduce a new data‐archiving policy. The policy has been enacted by the Executive Councils of the societies owning or sponsoring the journals.

Citation: Am Nat 2010. Vol. 175, pp. 145–146. DOI: 10.1086/650340

Read Full Post »

“Because the state of natural systems is never repeated, data losses, or missed data collection opportunities can never be corrected.” So says the AGU, recently reaffirming the importance of data availability and preservation.

The statement offers strong support for data archiving and publication as a routine part of the research process.

The cost of collecting, processing, validating, and submitting data to a recognized archive should be an integral part of research and operational programs. Such archives should be adequately supported with long-term funding. Organizations and individuals charged with coping with the explosive growth of Earth and space digital data sets should develop and offer tools to permit fast discovery and efficient extraction of online data, manually and automatically, thereby increasing their user base. The scientific community should recognize the professional value of such activities by endorsing the concept of publication of data, to be credited and cited like the products of any other scientific activity, and encouraging peer-review of such publications.

The full statement from the AGU Council can be found here.

Read Full Post »

A recent article, Motivating Online Publication of Data, in BioScience identifies multiple benefits for scientific authors when they publish data online.   Among them are:

•    additional publications
•    greater citation rate
•    invitations to collaborate

Author Mark Costello is a marine biologist at the University of Auckland, and has written widely on ocean biodiversity informatics.

Costello argues, “Considering that science is based on observations, it is astonishing that the publication of primary data is not a universal and mandatory part of science.”

He presents a cogent analysis of why data publication is crucial, how it can be encouraged, and what scientists, editors, and publishers must do to ensure access to data. In addition to itemizing the varied and far-reaching benefits of data archiving, he also repeats the oft-stated reasons scientists have given for not making their data available, and rebuts them succinctly.

Authors are not the only beneficiaries when data is openly available.   Considerable benefits exist for editors, publishers, data centers and funding agencies, including:

•    independent verification of research findings
•    increased citations to related research papers
•    better financial return from research investment

Data sharing is fundamental for scientific advancement; no arguments there. But how do we encourage data publication as a routine component of scientific research? We need to identify the benefits, and ensure that repositories, publishers, and other participants in the research process pay attention to incentives, implicit and otherwise, throughout the publication cycle.

Costello’s article is a good place to start.

Read Full Post »
