
Archive for the ‘Policy’ Category

On May 24, we held the first virtual Dryad Community Meeting, which allowed us to connect both with our membership and with the larger open data community, far and wide. The theme was “Leadership in data publishing: Dryad and learned societies.”

Following an introduction and update about Dryad from yours truly, we heard about the experiences of representatives from three of Dryad’s member societies.

All three societies require that data be archived in an appropriate repository as a condition of publication in their journals. Yet, they have each taken considerable time and effort to develop policies that address the needs and concerns of their different communities.

Bruna spoke about working with an audience that routinely gathers data for very long-term studies. For many Biotropica authors, embargoes are seen as an important prerequisite for data publishing. Their data policy “includes a generous embargo period of up to three years to ensure authors have ample time to publish multiple papers from more complex or long-term data sets”. Biotropica’s policy also recommends those “who re-use archived data sets to include as fully engaged collaborators the scientists who originally collected them”. To address initial resistance to data archiving, and to build understanding and consensus, Biotropica “enlisted its critics” to contribute to a paper discussing the pros and cons of data publication. Out of this process emerged an innovative policy that went into effect at the start of 2016.

Meaden, by contrast, noted that only 8% of Proceedings B authors elect to embargo data in Dryad, and the standard embargo is for only one year after publication. She credited clearer author instructions and a data availability statement in the manuscript submission system as key elements that have increased the availability of data associated with Royal Society publications.

Newton discussed BES’ move from “encouraging data publication” in 2012 to requiring it in 2014. As shown below, this resulted in an impressive increase in the availability of data. Next, the society is looking to develop guidance on data reuse etiquette. Newton noted that this effort would “need to be community-led.”


Slide from Erika Newton’s presentation, illustrating the rise in data deposits for BES journals as associated with changing data policy.

While each speaker reported on unique challenges, all shared commonalities, such as:

  • involving the specific community in policy decisions
  • incrementally increasing efforts to make data available
  • the importance of clear author instructions 

We greatly appreciate the excellent contributions from the panelists, as well as all the members and other attendees who participated and contributed to the lively Q&A.

We are also pleased that the virtual format was well received. In our follow-up survey, many of the attendees said they found it easy to ask questions and appreciated the ability to join remotely.

Our aim is that these meetings continue to be a valued forum for our diverse community of stakeholders to share knowledge and discuss emerging issues. If you have suggestions on topics for future meetings, or an interest in becoming a member, please reach out to me at director@datadryad.org.


Over the last few years, we’ve learned a lot about what is needed to curate, preserve, and provide access to data for the long term, as well as to sustain an independent not-for-profit organization. We’ve also paid close attention to the needs and wants of our user community and members. To meet these needs, we are revising our pricing structure for the first time since it was introduced in 2013.

  • Submissions initiated after 4 January 2016 will have a base Data Publication Charge (DPC) of US$120.
  • Pricing is now the same for all journals – there will no longer be an additional surcharge for non-integrated publications.
  • We encourage individuals and small groups to purchase bundles of DPC vouchers in advance and in any quantity. Purchases over 25 DPCs will enjoy a discount.
  • As a further user benefit, we will be doubling the maximum package size before overage fees kick in (to 20GB) and simplifying and reducing the overage fees.
  • We will continue to waive DPCs for researchers from World Bank low-income and lower-middle-income economies upon request.
  • Membership fees are not changing, but Dryad members will be entitled to receive larger discounts on DPCs.
  • As always, there are no fees to download or reuse data from Dryad.
  • Integrating Dryad’s system with partner journals remains a free service.

Dryad’s Board of Directors will continue to keep a close eye on the repository’s sustainability progress. We anticipate this price structure will remain stable for the foreseeable future and are always seeking opportunities for savings and efficiencies.

We are grateful to our community supporters and take seriously the responsibility to ensure the long-term availability of the research data entrusted to us.

Prepaid data submission vouchers can be purchased at current pricing levels ($80 apiece) through January 4th (and at the new price of $120 apiece after that), by contacting help@datadryad.org.

Payment plans are either subscription or usage-based. Organizations and individuals may also make advance purchases of any number of DPCs and are eligible for bulk discounts for purchases of 25 or more.

What exactly do your DPCs cover?

The following breakdown of expenses reflects projected costs in the near future, extrapolating from historic growth rates. Approximately half of costs are associated with Repository Management, including membership-based nonprofit governance, communications with Dryad’s many stakeholders, members and partners, and upkeep of software systems (Repository Maintenance). Another quarter of the costs are due to the curation and user support provided to each data package, part of Dryad’s unique service offering and commitment to quality.

Since Dryad is a virtual organization, Infrastructure & Facilities largely covers server costs, digital storage, and interoperability technologies such as Digital Object Identifiers (DOIs). A small fraction goes to community outreach activities to help encourage data publication best practices and raise awareness of Dryad. Administrative Support covers essential functions such as accounting and contract review.

Finally, Research and Development is essential for building new features to support changing technology and user expectations. R&D expenses are included here, but would ordinarily be covered through special project grants and not considered an operating expense paid for through DPCs.

We expect that as efficiencies are put into place, volume increases, and further economies of scale are realized, the percentage of the DPC supporting Repository Management will decrease and other areas, most notably Curation, will increase.

Figure: breakdown of projected expenses by category.


Dryad has been proud to support integrated data and manuscript submission for PLOS Biology since 2012, and for PLOS Genetics since 2013. Yet there are over 400 data packages in Dryad from six different PLOS journals in addition to two research areas of PLOS Currents. Today, we are pleased to announce that we have expanded submission integration to cover all seven PLOS journals, including the two above plus PLOS Computational Biology, PLOS Medicine, PLOS Neglected Tropical Diseases, PLOS ONE, and PLOS Pathogens.

PLOS received a great deal of attention when they modified their Data Policy in March, providing more guidance to authors on how and where to make their data available and introducing Data Availability Statements. Dryad’s integration process has been enhanced in a few ways to support this policy and also the needs of a megajournal like PLOS ONE. We believe these modifications provide an attractive model for integration that other journals may wish to follow. The key difference for authors who wish to deposit data in Dryad is that you are now asked to deposit your data before submitting your manuscript.

  1. PLOS authors are now asked to provide a Data Availability Statement during initial manuscript submission, as shown in the screenshot below. There is evidence that introducing a Data Availability Statement greatly reinforces the effectiveness of a mandatory data archiving policy, and so we expect this change will substantially increase the availability of data for PLOS publications.  PLOS authors using Dryad are encouraged to provide the provisional Dryad DOI as part of the Data Availability Statement.
  2. PLOS authors are now also asked to provide a Data Review URL where reviewers can access the data, as shown in the second screenshot. While Dryad has offered secure, anonymous reviewer access for some time, the difference now is that PLOS authors using Dryad will be able to enter the Data Review URL  at the time of initial manuscript submission.
  3. In addition to these visible changes, we have also introduced an Application Programming Interface (API) to facilitate behind-the-scenes metadata exchange between the journal and the repository, making the process more reliable and scalable. This was critical for PLOS ONE, which published 31,500 articles in 2013.  Use of this API is now available as an integration option to all journals as an alternative to the existing email-based process, which we will continue to support.

PLOS Data Availability Statement interface

PLOS Data Review URL interface

The manuscript submission interface for PLOS now includes fields for a Data Availability Statement and a Data Review URL.

If you are planning to submit a manuscript but are unsure about the Dryad integration options or process for your journal, just consult this page. For all PLOS journals, the data are released by Dryad upon publication of the article.  Should the manuscript be rejected, the data files return to the author’s private workspace and the provisional DOI is not registered.  Authors are responsible for paying Data Publication Charges only if and when their manuscript is accepted.

Jennifer Lin from PLOS and Carly Strasser from the California Digital Library recently offered a set of community recommendations for ways that publishers could promote better access to research data:

  • Establish and enforce a mandatory data availability policy.
  • Contribute to establishing community standards for data management and sharing.
  • Contribute to establishing community standards for data preservation in trusted repositories.
  • Provide formal channels to share data.
  • Work with repositories to streamline data submission.
  • Require appropriate citation to all data associated with a publication—both produced and used.
  • Develop and report indicators that will support data as a first-class scholarly output.
  • Incentivize data sharing by promoting the value of data sharing.

Today’s expanded and enhanced integration with Dryad, which inaugurates the new Data Repository Integration Partner Program at PLOS, is an excellent illustration of how to put these recommendations into action.


We are pleased to report that Molecular Ecology is now the first journal to surpass 1000 data packages in Dryad! Our latest featured data package is the one that took Molecular Ecology past the goalposts:

  • Bolnick D, Snowberg L, Caporaso G, Lauber C, Knight R, Stutz W (2014) Major Histocompatibility Complex class IIb polymorphism influences gut microbiota composition and diversity. Molecular Ecology doi:10.1111/mec.12846
  • Bolnick D, Snowberg L, Stutz W, Caporaso G, Lauber C, Knight R (2014) Data from: Major Histocompatibility Complex class IIb polymorphism influences gut microbiota composition and diversity. Dryad Digital Repository doi:10.5061/dryad.2s07s

Why so many data packages from Molecular Ecology? It is likely due to a few factors. One, Molecular Ecology publishes a lot of papers (445 in 2012 according to Journal Citation Reports) and has had integrated data and manuscript submission with Dryad since 2010. Two, the field works with many datatypes for which no specialized repository exists. Three, Molecular Ecology not only began requiring data archiving in 2011 when it adopted the Joint Data Archiving Policy, but actually goes beyond JDAP by requiring a completed data availability statement in each article, something that managing editor Tim Vines and his colleagues have shown to be associated with very high rates of data archiving. Four, since Dryad introduced Data Publishing Charges, Molecular Ecology has been sponsoring those charges on behalf of its authors.

Other journals looking to support data archiving in their fields would do well to look at Molecular Ecology as a model.


Updates: The originally scheduled keynote address from Phil Bourne will instead be a session on “The Future of Open Data – What to Expect from US Funders” with Jennie Larkin, Deputy Associate Director for Data Science at NIH, and Peter McCartney, Program Director in the Division of Biological Infrastructure at NSF. Also, doors will open at 8:30 for a reception, at which light breakfast will be served.

We’re pleased to announce that our 2014 Community Meeting will be held on May 28 at the Institute for Quantitative Social Science at Harvard University.  This year’s meeting is being held jointly with the Dataverse Network Project, and the theme is Working Together on Data Discovery, Access and Reuse.

Many actors play a role in ensuring that research data is available for future knowledge discovery, including individual researchers, their institutions, publishers and funders. This joint community meeting will highlight existing solutions and emerging issues in the discovery, access and reuse of research data in the social and natural sciences.

Keynote speaker Dr. Phil Bourne is the first and newly appointed Associate Director for Data Science at the National Institutes of Health and a pioneer in furthering the free dissemination of science through new models of publishing. Prior to his NIH appointment, he was a Professor and Associate Vice Chancellor at the University of California San Diego.  He has over 300 papers and 5 books to his credit. Among his diverse contributions, he was the founding Editor-in-Chief of PLOS Computational Biology, has served as Associate Director of the RCSB Protein Data Bank, has launched four companies, most recently SciVee, and is a Past President of the International Society for Computational Biology. He is an elected fellow of the American Association for the Advancement of Science, the International Society for Computational Biology and the American Medical Informatics Association. Other honors he has received include the Benjamin Franklin Award in 2009 and the Jim Gray eScience Award in 2010.

The meeting will run from 8:30 9:00 am – 2:15 pm, including light breakfast and a catered lunch.  It will be followed by a Dryad Members Meeting, open to all attendees, from 2:30 – 3:30 pm.

There is no cost for registration, but space is limited. Onsite registration will be made available if space allows, and the proceedings will also be simulcast online.  Please see the meeting page for details.

This year’s Community Meeting has been scheduled for the convenience of those attending the Society for Scholarly Publishing Annual Meeting from May 28-30 in Boston.  SSP attendees may also wish to attend the session “The continuum from publishers to data repositories: models to support seamless scholarship”  May 29th from 10:45am-12:00pm.

For inquiries, please contact Laura Wendell (lwendell@datadryad.org) or Mercè Crosas (mcrosas@iq.harvard.edu).


The Data Citation Synthesis Group has released a draft Declaration of Data Citation Principles and invites comment.

This has been a very interesting and positive collaborative process and has involved a number of groups and committed individuals. Encouraging the practice of data citation, it seems to me, is one of the key steps towards giving research data its proper place in the literature.

As the preamble to the draft principles states:

Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice.

In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data citation.

Please do comment on these principles. We hope that with community feedback and support, a finalised set of principles can be widely endorsed and adopted.

Discussion on a variety of lists is welcome, of course. However, if you want the Synthesis Group to take full account of your views, please be sure to post your comments on the discussion forum.

Some notes and observations on the background to these principles

I would like to add here some notes and observations on the genesis of these principles. As has been widely observed, a number of groups and interested parties have been involved in exploring the principles of data citation for a number of years. Mentioning only some of the sources and events that affected my own thinking on the matter, there was the 2007 Micah Altman and Gary King article in D-Lib Magazine, which offered ‘A Proposed Standard for the Scholarly Citation of Quantitative Data’, and Toby Green’s OECD White Paper ‘We need publishing standards for datasets and data tables’ in 2009. Micah Altman and Mercè Crosas organised a workshop at Harvard in May 2011 on Data Citation Principles. Later the same year, the UK Digital Curation Centre published a guide to citing data.

The CODATA-ICSTI Task Group on Data Citation Standards and Practices (co-chaired by Christine Borgman, Jan Brase and Sarah Callaghan) has been in existence since 2010. In collaboration with the US National CODATA Committee and the Board on Research Data and Information, a major workshop was organised in August 2011, which was reported in ‘For Attribution: Developing Data Attribution and Citation Practices and Standards’.

The CODATA-ICSTI Task Group then started work on a report covering data citation principles, eventually entitled ‘Out of Cite, Out of Mind’ – drafts were circulated for comment in April 2013 and the final report was released in September 2013.

Following the first ‘Beyond the PDF’ meeting in Jan 2011 participants produced the Force11 Manifesto ‘Improving Future Research Communication and e-Scholarship’ which places considerable weight on the availability of research data and the citation of those data in the literature. At ‘Beyond the PDF II’ in Amsterdam, March 2013, a group comprising Mercè Crosas, Todd Carpenter, David Shotton and Christine Borgman produced ‘The Amsterdam Manifesto on Data Citation Principles’. In the very same week, in Gothenburg, an RDA Birds of a Feather group was discussing the more specific problem of how to support, technologically, the reliable and efficient citation of dynamically changing or growing datasets and subsets thereof. And the broader issues of the place of data and research publication were being considered in the ICSU World Data Service Working Group on Data Publication. This group has, in turn, formed the basis for an RDA Interest Group.  Oooffff!

How great a thing is collaboration?

From June 2013, as the Force11 Group was preparing its website and activities to take forward the work on the Amsterdam Manifesto, calls came in from a number of sources for these various groups and initiatives to coordinate and collaborate. This was admirably well-received and from July the ‘Data Citation Synthesis Group’ had come into being with an agreed mission statement:

The data citation synthesis group is a cross-team committee leveraging the perspectives from the various existing initiatives working on data citation to produce a consolidated set of data citation principles (based on the Amsterdam Manifesto, the CODATA and other sets of principles provided by others) in order to encourage broad adoption of a consistent policy for data citation across disciplines and venues. The synthesis group will review existing efforts and make a set of recommendations that will be put up for endorsement by the organizations represented by this synthesis group.

The synthesis group will produce a set of principles, illustrated with working examples, and a plan for dissemination and distribution. This group will not be producing detailed specifications for implementation, nor focus on technologies or tools.

As has been noted elsewhere, the group comprised 40 individuals and brought together a large number of organisations and initiatives. What followed over the summer was a set of weekly calls to discuss and align the principles. I must say, I thought these were admirably organised and benefitted considerably from participants’ efforts to prepare documents comparing the various groups’ statements. The face-to-face meeting of the group, in which a lot of detailed discussion to finalise the draft was undertaken, was hosted (with a funding contribution from CODATA) at the US National Academies of Science between the 2nd RDA Plenary and the DataCite Summer Meeting (which CODATA also co-sponsored). It has been intellectually stimulating and a real pleasure to contribute to these discussions and to witness so many informed and engaged people bashing out these issues.

The principles developed by the Synthesis Group are now open for comment and I urge as many people, researchers, editors and publishers as possible who believe that data has a place in scholarly communications to comment on them and, in due course, to endorse them and put them into practice.

Are we finally at the cusp of real change in practice? Will we now start seeing the practice of citing data sources become more and more widespread? It’s too soon to say for sure, but I hope these principles, and the work on which they build, have got us to a stage where we can start really believing the change is well underway.

Simon Hodson is Executive Director of CODATA and a member of the Dryad Board of Directors.  This post was originally published on the CODATA blog.


As we announced earlier, Dryad will be introducing data publishing fees at the beginning of September. Here’s why we are doing this, and what it will mean for you as a submitter.

Why?

The Data Publishing Charge (DPC) is a modest fee that recovers the basic costs of curation and preservation, and allows Dryad to make its contents freely available for researchers and educators at any institution anywhere in the world. DPCs provide a broad and fair revenue stream that scales with the costs of maintaining the repository, and help ensure that Dryad can keep its commitment to long-term accessibility.

Who pays?

There are three cases:

  1. The DPC is waived in its entirety if the submitter is based in a country classified by the World Bank as a low-income or lower-middle-income economy.
  2. For many journals, the society or publisher will sponsor the DPC on behalf of their authors; you can see whether this applies to your journal here (the list is growing quickly, so be sure to check back when you are ready to submit new data).
  3. In the absence of a waiver or a sponsor, the DPC is US$80, payable by the submitter.  Payment details are accepted upon submission, but the fee will not be charged unless and until the data package is accepted for publication.

Two additional fees may apply. Submitters will be charged for data packages in excess of 10GB (US$15 for the first additional GB and US$10 for each GB thereafter), to cover additional storage costs.  If there is no sponsor, and the data package is associated with a journal lacking integrated data and manuscript submission, the submitter will be charged US$10 to cover the additional curation costs.
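For submitters who want to estimate what they would owe, the fee rules above can be sketched in a few lines of Python. This is only an illustration, not Dryad's billing code: the function and constant names are made up for this example, and it rests on three assumptions the text does not settle. Partial gigabytes are rounded up to a whole GB, the excess-size fee is charged to the submitter even when the base DPC is waived or sponsored, and the US$10 surcharge for non-integrated journals is skipped when the DPC is waived as well as when it is sponsored.

```python
import math

# Figures taken from the fee description above (all in US dollars).
BASE_DPC_USD = 80
INCLUDED_GB = 10
FIRST_EXTRA_GB_USD = 15
EACH_FURTHER_GB_USD = 10
NON_INTEGRATED_SURCHARGE_USD = 10


def overage_fee(size_gb):
    """Fee for the part of a data package above the included 10GB.

    Assumption: a partial extra GB is billed as a whole GB.
    """
    extra_gb = max(0, math.ceil(size_gb - INCLUDED_GB))
    if extra_gb == 0:
        return 0
    return FIRST_EXTRA_GB_USD + EACH_FURTHER_GB_USD * (extra_gb - 1)


def submitter_charge(size_gb, waived=False, sponsored=False, integrated=True):
    """Estimate what the submitter pays for one data package.

    Assumptions (not stated in the post): the overage fee applies even
    when the base DPC is waived or sponsored, and the non-integrated
    surcharge is skipped for waived as well as sponsored submissions.
    """
    covered = waived or sponsored
    base = 0 if covered else BASE_DPC_USD
    surcharge = 0 if (covered or integrated) else NON_INTEGRATED_SURCHARGE_USD
    return base + surcharge + overage_fee(size_gb)


if __name__ == "__main__":
    # A 12GB package with no sponsor, for an integrated journal.
    print(submitter_charge(12))
```

Under these assumptions, a 12GB unsponsored package for an integrated journal comes to the US$80 base charge plus US$15 for the first extra GB and US$10 for the second.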

Submitters may use grant funds, institutional funds, or any other source, as long as payment can be made using a credit card or PayPal.  We regret that submitters cannot be invoiced for single submissions – but please do contact us if you are interested in purchasing a larger group of vouchers for future use.  We encourage researchers to inquire with librarians at their institution about available funding sources, and to budget data publication funds for future submissions into their grants, as part of their data management plan.

Note that there will be no charges for submissions made before the introduction of DPCs in September, regardless of when the data package is accepted for publication.

Help us spread the word

If your organization does not yet sponsor Data Publication Charges, or is not yet a Member, you may wish to let them know that you feel data archiving deserves their financial support.  Dryad offers a variety of flexible payment plans that provide for volume discounts, and there are additional discounts for Member organizations.  Organizations need not be publishers. Universities, funders, libraries and even individual research groups can purchase bundles of single-use vouchers that will cover the DPCs for data packages associated with publications appearing in any journal, as well as other publication types such as monographs and theses.  Prospective sponsors and Members may contact director@datadryad.org to figure out what will work best for their circumstances.

We are grateful for all the input we have received into our sustainability planning, and look forward to the continued support of our community in carrying out our nonprofit mission for many long years to come.  If you have questions or suggestions, please leave a comment or contact us here.
