Archive for the ‘Policy’ Category

In 2011 Peggy Schaeffer penned an entry for this blog titled “Why does Dryad use CC0?” While 2011 seems like a long time ago, especially in our rapidly evolving digital world, the information in that piece is still as valid and relevant now as it was then. In fact, Dryad curators routinely direct authors to that blog entry to help them understand and resolve licensing issues. Since dealing with licensing matters can be confusing, it seems about time to revisit this briefly from a practical perspective.

Dryad uses Creative Commons Zero (CC0) to promote the reuse of data underlying scholarly literature. CC0 provides consistent, clear, and open terms of reuse for all data in our repository by allowing researchers, authors, and others to waive all copyright and related rights for a work and place the work in the public domain. Users know they can reuse any data available in Dryad with minimal impediments; authors gain the potential for more citations without having to spend time responding to requests from those wishing to use their data. In other words, CC0 helps eliminate the headaches associated with copyright and licensing issues for all stakeholders, leading to more data reuse.

So what does this mean in practical terms? Dryad’s curators have come up with a few suggestions to keep in mind as you prepare your data for submission. These tips can help you manage the CC0 requirements and avoid any problems:

DO:

  • Make sure any software included with your submission can be released under CC0. For example, licenses such as GPL or MIT are common and are not compatible with CC0. Be sure there are no licensing statements displayed in the software itself or in associated readme files.
  • Be aware that some software applications automatically place any output they produce under a license that is not compatible with CC0. Consider this when you are deciding which software to use to prepare your data.
  • Know the terms of use for any information you get from a website or database.
  • Ensure that any images, videos, or other media that are not your own work can be released under CC0.
  • Be sure to clean up your data before submitting it, especially if you are compressing it using a tool such as zip or tar. Remove anything that can’t be released under CC0, along with any other extraneous materials, such as user manuals for hardware or software tools. Not only does removing extraneous files lessen the chance something will conflict with Dryad’s CC0 policy, it also makes your data more streamlined and easier to use.

DON’T:

  • Don’t add text anywhere in your data submission requiring permission or attribution for reuse. Community norms do a great job of putting in place the expectation that anyone reusing your data will provide the proper citations. CC0 actually encourages citation by keeping the process as simple as possible.
  • Don’t include your entire manuscript or parts of your manuscript in your data package. Most publications have licensing that restricts reuse and is not compatible with CC0.
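Some of the checks above can be partly automated before you compress a submission. The following is a minimal sketch, not a Dryad tool: the marker phrases, the list of file names, and the function name `find_license_flags` are all illustrative assumptions. A clean result does not show that a package is CC0-compatible; it only points to files worth a manual look.

```python
from pathlib import Path

# Hypothetical marker phrases; extend these for your own field and tools.
LICENSE_MARKERS = (
    "gnu general public license",
    "mit license",
    "all rights reserved",
    "permission is hereby granted",
)
# File names that often carry licensing terms (an assumption, not a Dryad list).
LICENSE_FILENAMES = {"license", "license.txt", "license.md", "copying", "copying.txt"}


def find_license_flags(root, max_bytes=65536):
    """Return (relative_path, reason) pairs for files that deserve a manual check."""
    root = Path(root)
    flags = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if path.name.lower() in LICENSE_FILENAMES:
            flags.append((rel, "license file name"))
            continue
        try:
            # Only the start of each file is read, so large data files stay cheap.
            with path.open("rb") as handle:
                head = handle.read(max_bytes)
        except OSError:
            continue
        text = head.decode("utf-8", errors="ignore").lower()
        for marker in LICENSE_MARKERS:
            if marker in text:
                flags.append((rel, f"contains '{marker}'"))
                break
    return flags
```

Calling `find_license_flags("my_submission")` on the folder you plan to zip returns a list you can review by hand; the folder name here is only a placeholder.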

I hope this post leaves you with a little more understanding about why Dryad uses CC0 and with a few tips that will help make following Dryad’s CC0 requirement easier.

 


As a non-profit repository dependent on support from members and users, Dryad is greatly concerned with the economics and sustainability of data services. Our business model is built around Data Publishing Charges (DPCs), designed to recover the basic costs of curating and preserving data. Dryad DPCs can be covered in 3 ways:

  1. The DPC is waived if the submitter is based in a country classified by the World Bank as a low-income or lower-middle-income economy.
  2. For many journals, the society or publisher will sponsor the DPC on behalf of their authors (to see whether this applies, look up your journal).
  3. In the absence of a waiver or a sponsor, the DPC is US$120, payable by the submitter.
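The three cases above amount to a short decision rule. Here is a minimal sketch of it, assuming the US$120 charge stated in this post; the function name and the income-group labels are hypothetical, and looking up a country's World Bank classification or a journal's sponsorship status is not shown.

```python
DPC_USD = 120  # base charge stated in the post

# Income groups for which the post says the DPC is waived (labels are illustrative).
WAIVED_INCOME_GROUPS = {"low", "lower-middle"}


def dpc_owed_by_submitter(income_group, journal_sponsors_dpc):
    """Return (amount_usd, reason), checking the cases in the order listed above."""
    if income_group in WAIVED_INCOME_GROUPS:
        return 0, "waived"
    if journal_sponsors_dpc:
        return 0, "sponsored"
    return DPC_USD, "submitter pays"
```

For example, a submitter in a high-income country whose journal does not sponsor the charge falls through to the last case and owes the full amount.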

Our long-term aim is to increase sponsorships and reduce the financial responsibility of individual researchers.

Last year, we launched a pilot study sponsored by the US National Science Foundation to test the feasibility of having a funding agency directly sponsor the DPC. We conducted a survey of Dryad submitters as part of the pilot, hoping to learn more about how researchers plan and pay for data archiving.

Initial survey results

We first want to say a hearty THANK YOU to our participants for giving us so much good information to work with! (Ten participants were randomly selected to receive gift cards as a sign of our appreciation.) Respondents were located around the world, with nearly all based at academic institutions.

Survey respondents' positions

A word about selection of survey participants. We know that approximately 1/3 of all Dryad data publications do not have a sponsor or waiver, meaning the researcher is responsible for covering the $120 charge. We wanted to learn more about payment methods and funding sources for these non-sponsored DPCs.

We specifically solicited researchers for our survey who had 1) submitted to Dryad in the previous year and 2) paid their Data Publishing Charge directly (via credit card or voucher code). The survey questions focused on a few topics:

  • Grant funding and Data Management Plans
  • Where the money for their Data Publishing Charges ultimately came from, and
  • Whether funding concerns affect their data archiving behavior.

A few highlights are presented below; we intend to dig deeper into the survey results (and other information gathered as part of the pilot study) and report on them publicly in the coming months.

Planning for data in grant proposals

Nearly 72% of respondents indicated that the research associated with their publication/data was supported by a grant. We wanted to know how (or whether) researchers planned ahead for archiving their data in their grant proposals, and the results were enlightening:

  • 43% did not include a Data Management Plan (DMP) as part of their proposal for funding.
  • Of those who did submit a DMP, only about 46% committed to archiving their data as part of that plan.
  • A whopping 96% said they did not specifically budget for data archiving in their proposal.
  • Only 41% were able to archive their data within the grant funding period, while 59% were unable to, or were unsure.

As these results indicate, data management/stewardship is still not a high priority at the grant proposal stage. Even when researchers plan for data deposition, they don’t consider the costs associated. And even if they do (hypothetically) have funding specifically for data, the timing may not allow them to use it before the grant expires.

These factors suggest that if funding agencies want to prioritize supporting data stewardship, they should make funds available for this purpose outside the traditional grant structure.

Show me the money

When submitters pay the Dryad Data Publishing Charge themselves, where does that money come from? Are submitters being reimbursed? If so, how/by whom?

Our results showed that, unfortunately, about a quarter of our participants paid their DPCs out-of-pocket and did not receive any reimbursement. Approximately the same number paid themselves but were reimbursed (by their institution, a grant, or some combination of these), and 37% of DPCs were paid directly by the institution (using an institutional credit card or voucher code).

How was the Dryad DPC paid?

 

Some respondents view self-funding of data publication as worthwhile:

My belief is that scientific data should be publicly available and I am willing to cover the costs myself if supervisors (grant holders) do not.

As long as the cost is reasonable, in the worse case scenario I pay from my pocket. Better the data are safe and easily accessible for years to come than stored in spurious formats and difficult-to-access servers.

But for many others, covering the payment can be a real pain point:

I paid the processing charge myself mainly because our University’s reimbursement process was so laborious, I felt it easier just to get it over and done with myself and absorb the relatively small cost personally.

I just have to beg and plead for funding support each time.

If I am publishing after the postdoc ends then I am no longer paid to work on the project. Since I have had four postdocs, each lasting less than two years, this has happened for all my publications.

Examples from the “other” payment category shown above illustrate the scrappiness of researchers in finding funding:

I paid this from flexible research funds that were recently awarded by my institution. Had that not occurred, I would have had to pay personally and not be reimbursed.

I used my RTF (research trust fund) since I didn’t have dedicated grant funding.

Scavenged money from other projects.

Key takeaways

Our preliminary results show that at a time of more and stronger open data policies, paying for data publication remains far from straightforward, with much of the burden passed along to individual researchers.

Concerns about funding for open data can have real impacts on research availability and publication choice. More than 15% of our participants indicated that they have collected data in the last few years that they have been unable to archive due to lack of funds. Meanwhile, over 40% say that when choosing which journal(s) to submit to, sponsorship of the Dryad DPC does, or at least may, influence their decision.

The good news is that during our 8-month pilot implementation period, the US National Science Foundation sponsored nearly 200 Data Publishing Charges for which researchers would otherwise have been responsible.

We at Dryad are committed to finding and implementing solutions, and very much appreciate the feedback and support we receive from the research and publishing community. Stay tuned for more lessons learned.


We’re coming off a big month that included a two-day Dryad board meeting, International Data Week in Denver, and the Open Access Publishers meeting (COASP) in Arlington, VA. Combined with Open Access Week, we’ve been basking in all things #openscience at Dryad.

International Data Week 2016

International Data Week was a collection of three different events: SciDataCon 2016, the International Data Forum, and the 8th Research Data Alliance Plenary Meeting. While it was my first time attending RDA and SciDataCon, it wasn’t the first time for the many Dryad board members who have been actively participating in these forums for years.

Dryad staff had the pleasure of participating in a few panels over the week. As part of SciDataCon, Elizabeth Hull discussed protecting human subjects in an open data repository. In another, as part of the RDA 8th Plenary, I participated in a discussion of the challenges surrounding sustainability of data infrastructure. (The talk is available on the RDA website; the panel starts at minute 30.)

Participating in IDW reminded me how important our diverse community of stakeholders and members is to furthering the adoption of open data. Dryad members create a community and support our mission. Our members benefit by receiving discounts on data publication fees and by relying on a repository that stays current with the evolving needs and mandates that surround open data. We work together to help make open data easy and affordable for authors.

Asking OA publishers to be more open

Following International Data Week, I had the opportunity to participate for the first time in the Open Access Scholarly Publishers Association meeting, COASP 2016. Heather Joseph, Executive Director of SPARC, kicked off the meeting with a keynote that urged attendees to consider how they would complete the phrase “Open in order to . . .” as a way to ensure that we all keep our sights on working toward something more than just ‘open for the sake of open’. Some of the other memorable talks addressed the challenges of mapping connections from articles to other related outputs, and discussed the growing interest in alternative revenue models to article processing charges (APCs). I had the privilege of delivering a keynote entitled “Be More Open”, which highlighted the connections between the Open Access and Open Data movements, and I encouraged OASPA to add open data policies to their membership requirements.

I’d like to thank the organizers and sponsors of International Data Week and COASP 2016 for making these important conversations possible. I would also like to encourage any interested stakeholders to join Dryad and support open data.


On May 24, we held the first virtual Dryad Community Meeting, which allowed us to connect both with our membership and with the larger open data community, far and wide. The theme was “Leadership in data publishing: Dryad and learned societies.”

Following an introduction and update about Dryad from yours truly, we heard about the experiences of representatives from three of Dryad’s member societies.

All three societies require that data be archived in an appropriate repository as a condition of publication in their journals. Yet, they have each taken considerable time and effort to develop policies that address the needs and concerns of their different communities.

Bruna spoke about working with an audience that routinely gathers data for very long-term studies. For many Biotropica authors, embargoes are seen as an important prerequisite for data publishing. Their data policy “includes a generous embargo period of up to three years to ensure authors have ample time to publish multiple papers from more complex or long-term data sets”. Biotropica’s policy also recommends those “who re-use archived data sets to include as fully engaged collaborators the scientists who originally collected them”. To address initial resistance to data archiving, and to build understanding and consensus, Biotropica “enlisted its critics” to contribute to a paper discussing the pros and cons of data publication. Out of this process emerged an innovative policy that went into effect at the start of 2016.

Meaden, by contrast, noted that only 8% of Proceedings B authors elect to embargo data in Dryad, and the standard embargo is for only one year after publication. She credited clearer author instructions and a data availability statement in the manuscript submission system as key elements that have increased the availability of data associated with Royal Society publications.

Newton discussed BES’ move from “encouraging data publication” in 2012 to requiring it in 2014. As shown below, this resulted in an impressive increase in the availability of data. Next, the society is looking to develop guidance on data reuse etiquette. Newton noted that this effort would “need to be community-led.”

Slide from Erika Newton’s presentation, illustrating the rise in data deposits for BES journals as associated with changing data policy.

While each speaker reported on unique challenges, all shared commonalities, such as:

  • involving the specific community in policy decisions
  • incrementally increasing efforts to make data available
  • the importance of clear author instructions 

We greatly appreciate the excellent contributions from the panelists, as well as all the members and other attendees who participated and contributed to the lively Q&A.

We are also pleased that the virtual format was well received. In our follow-up survey, many of the attendees said they found it easy to ask questions and appreciated the ability to join remotely.

Our aim is that these meetings continue to be a valued forum for our diverse community of stakeholders to share knowledge and discuss emerging issues. If you have suggestions on topics for future meetings, or an interest in becoming a member, please reach out to me at director@datadryad.org.


Over the last few years, we’ve learned a lot about what is needed to curate, preserve, and provide access to data for the long term, as well as to sustain an independent not-for-profit organization. We’ve also paid close attention to the needs and wants of our user community and members. To meet these needs, we are revising our pricing structure for the first time since it was introduced in 2013.

  • Submissions initiated after 4 January 2016 will have a base Data Publication Charge (DPC) of US$120.
  • Pricing is now the same for all journals – there will no longer be an additional surcharge for non-integrated publications.
  • We encourage individuals and small groups to purchase bundles of DPC vouchers in advance and in any quantity. Purchases over 25 DPCs will enjoy a discount.
  • As a further user benefit, we will be doubling the maximum package size before overage fees kick in (to 20GB) and simplifying and reducing the overage fees.
  • We will continue to waive DPCs for researchers from World Bank low-income and lower-middle-income economies upon request.
  • Membership fees are not changing, but Dryad members will be entitled to receive larger discounts on DPCs.
  • As always, there are no fees to download or reuse data from Dryad.
  • Integrating Dryad’s system with partner journals remains a free service.

Dryad’s Board of Directors will continue to keep a close eye on the repository’s sustainability progress. We anticipate this price structure will remain stable for the foreseeable future and are always seeking opportunities for savings and efficiencies.

We are grateful to our community supporters and take seriously the responsibility to ensure the long-term availability of the research data entrusted to us.

Prepaid data submission vouchers can be purchased at current pricing levels ($80 apiece) through January 4th (and at the new price of $120 apiece after that), by contacting help@datadryad.org.

Payment plans are either subscription or usage-based. Organizations and individuals may also make advance purchases of any number of DPCs and are eligible for bulk discounts for purchases of 25 or more.

What exactly do your DPCs cover?

The following breakdown of expenses reflects projected costs in the near future, extrapolating from historic growth rates. Approximately half of costs are associated with Repository Management, including membership-based nonprofit governance, communications with Dryad’s many stakeholders, members and partners, and upkeep of software systems (Repository Maintenance). Another quarter of the costs are due to the curation and user support provided to each data package, part of Dryad’s unique service offering and commitment to quality.

Since Dryad is a virtual organization, Infrastructure & Facilities largely covers server costs, digital storage, and interoperability technologies such as Digital Object Identifiers (DOIs). A small fraction goes to community outreach activities to help encourage data publication best practices and raise awareness of Dryad. Administrative Support covers essential functions such as accounting and contract review.

Finally, Research and Development is essential for building new features to support changing technology and user expectations. R&D expenses are included here, but would ordinarily be covered through special project grants and not considered an operating expense paid for through DPCs.

We expect that as efficiencies are put into place, volume increases, and further economies of scale are realized, the percentage of the DPC supporting Repository Management will decrease and other areas, most notably Curation, will increase.


Dryad has been proud to support integrated data and manuscript submission for PLOS Biology since 2012, and for PLOS Genetics since 2013. Yet there are over 400 data packages in Dryad from six different PLOS journals in addition to two research areas of PLOS Currents. Today, we are pleased to announce that we have expanded submission integration to cover all seven PLOS journals, including the two above plus PLOS Computational Biology, PLOS Medicine, PLOS Neglected Tropical Diseases, PLOS ONE, and PLOS Pathogens.

PLOS received a great deal of attention when they modified their Data Policy in March, providing more guidance to authors on how and where to make their data available and introducing Data Availability Statements. Dryad’s integration process has been enhanced in a few ways to support this policy and also the needs of a megajournal like PLOS ONE. We believe these modifications provide an attractive model for integration that other journals may wish to follow. The key difference for authors who wish to deposit data in Dryad is that you are now asked to deposit your data before submitting your manuscript.

  1. PLOS authors are now asked to provide a Data Availability Statement during initial manuscript submission, as shown in the screenshot below. There is evidence that introducing a Data Availability Statement greatly reinforces the effectiveness of a mandatory data archiving policy, and so we expect this change will substantially increase the availability of data for PLOS publications.  PLOS authors using Dryad are encouraged to provide the provisional Dryad DOI as part of the Data Availability Statement.
  2. PLOS authors are now also asked to provide a Data Review URL where reviewers can access the data, as shown in the second screenshot. While Dryad has offered secure, anonymous reviewer access for some time, the difference now is that PLOS authors using Dryad will be able to enter the Data Review URL  at the time of initial manuscript submission.
  3. In addition to these visible changes, we have also introduced an Application Programming Interface (API) to facilitate behind-the-scenes metadata exchange between the journal and the repository, making the process more reliable and scalable. This was critical for PLOS ONE, which published 31,500 articles in 2013.  Use of this API is now available as an integration option to all journals as an alternative to the existing email-based process, which we will continue to support.

PLOS Data Availability Statement interface

PLOS Data Review URL interface

The manuscript submission interface for PLOS now includes fields for a Data Availability Statement and a Data Review URL.

If you are planning to submit a manuscript but are unsure about the Dryad integration options or process for your journal, just consult this page. For all PLOS journals, the data are released by Dryad upon publication of the article.  Should the manuscript be rejected, the data files return to the author’s private workspace and the provisional DOI is not registered.  Authors are responsible for paying Data Publication Charges only if and when their manuscript is accepted.

Jennifer Lin from PLOS and Carly Strasser from the California Digital Library recently offered a set of community recommendations for ways that publishers could promote better access to research data:

  • Establish and enforce a mandatory data availability policy.
  • Contribute to establishing community standards for data management and sharing.
  • Contribute to establishing community standards for data preservation in trusted repositories.
  • Provide formal channels to share data.
  • Work with repositories to streamline data submission.
  • Require appropriate citation to all data associated with a publication—both produced and used.
  • Develop and report indicators that will support data as a first-class scholarly output.
  • Incentivize data sharing by promoting the value of data sharing.

Today’s expanded and enhanced integration with Dryad, which inaugurates the new Data Repository Integration Partner Program at PLOS, is an excellent illustration of how to put these recommendations into action.

Read Full Post »

We are pleased to report that Molecular Ecology is now the first journal to surpass 1000 data packages in Dryad! Our latest featured data package is the one that took Molecular Ecology past the goalposts:

  • Bolnick D, Snowberg L, Caporaso G, Lauber C, Knight R, Stutz W (2014) Major Histocompatibility Complex class IIb polymorphism influences gut microbiota composition and diversity. Molecular Ecology doi:10.1111/mec.12846
  • Bolnick D, Snowberg L, Stutz W, Caporaso G, Lauber C, Knight R (2014) Data from: Major Histocompatibility Complex class IIb polymorphism influences gut microbiota composition and diversity. Dryad Digital Repository doi:10.5061/dryad.2s07s

Why so many data packages from Molecular Ecology? It is likely due to a few factors. One, Molecular Ecology publishes a lot of papers (445 in 2012 according to Journal Citation Reports) and has had integrated data and manuscript submission with Dryad since 2010. Two, the field works with many data types for which no specialized repository exists. Three, Molecular Ecology not only began requiring data archiving in 2011 when it adopted the Joint Data Archiving Policy, but actually goes beyond JDAP by requiring a completed data availability statement in each article, something that managing editor Tim Vines and his colleagues have shown to be associated with very high rates of data archiving. Four, since Dryad introduced Data Publishing Charges, Molecular Ecology has been sponsoring those charges on behalf of its authors.

Other journals looking to support data archiving in their fields would do well to look at Molecular Ecology as a model.
