How do researchers pay for data publishing? Results of a recent submitter survey

As a non-profit repository dependent on support from members and users, Dryad is greatly concerned with the economics and sustainability of data services. Our business model is built around Data Publishing Charges (DPCs), designed to recover the basic costs of curating and preserving data. Dryad DPCs can be covered in 3 ways:

  1. The DPC is waived if the submitter is based in a country classified by the World Bank as a low-income or lower-middle-income economy.
  2. For many journals, the society or publisher will sponsor the DPC on behalf of their authors (to see whether this applies, look up your journal).
  3. In the absence of a waiver or a sponsor, the DPC is US$120, payable by the submitter.

Our long-term aim is to increase sponsorships and reduce the financial responsibility of individual researchers.

Last year, we launched a pilot study sponsored by the US National Science Foundation to test the feasibility of having a funding agency directly sponsor the DPC. We conducted a survey of Dryad submitters as part of the pilot, hoping to learn more about how researchers plan and pay for data archiving.

Initial survey results

We first want to say a hearty THANK YOU to our participants for giving us so much good information to work with! (10 participants were randomly selected to receive gift cards as a sign of our appreciation). Respondents were located around the world, with nearly all based at academic institutions.

Survey respondents' positions

A word about selection of survey participants. We know that approximately 1/3 of all Dryad data publications do not have a sponsor or waiver, meaning the researcher is responsible for covering the $120 charge. We wanted to learn more about payment methods and funding sources for these non-sponsored DPCs.

We specifically solicited researchers for our survey who had 1) submitted to Dryad in the previous year and 2) paid their Data Publishing Charge directly (via credit card or voucher code). The survey questions focused on a few topics:

  • Grant funding and Data Management Plans
  • Where the money for their Data Publishing Charges ultimately came from, and
  • Whether funding concerns affect their data archiving behavior.

A few highlights are presented below; we intend to dig deeper into the survey results (and other information gathered as part of the pilot study) and report on them publicly in the coming months.

Planning for data in grant proposals

Nearly 72% of respondents indicated that the research associated with their publication/data was supported by a grant. We wanted to know how (or whether) researchers planned ahead for archiving their data in their grant proposals, and the results were enlightening:

  • 43% did not include a Data Management Plan (DMP) as part of their proposal for funding.
  • Of those who did submit a DMP, only about 46% committed to archiving their data as part of that plan.
  • A whopping 96% said they did not specifically budget for data archiving in their proposal.
  • Only 41% were able to archive their data within the grant funding period, while 59% were unable to, or were unsure.

As these results indicate, data management/stewardship is still not a high priority at the grant proposal stage. Even when researchers plan for data deposition, they don’t consider the costs associated. And even if they do (hypothetically) have funding specifically for data, the timing may not allow them to use it before the grant expires.

These factors suggest that if funding agencies want to prioritize supporting data stewardship, they should make funds available for this purpose outside the traditional grant structure.

Show me the money

When submitters pay the Dryad Data Publishing Charge themselves, where does that money come from? Are submitters being reimbursed? If so, how/by whom?

Our results showed that, unfortunately, about a quarter of our participants paid their DPCs out-of-pocket and did not receive any reimbursement. Approximately the same number paid themselves but were reimbursed (by their institution, a grant, or some combination of these), and 37% of DPCs were paid directly by the institution (using an institutional credit card or voucher code).

How was the Dryad DPC paid?

 

Some respondents view self-funding of data publication as worthwhile:

My belief is that scientific data should be publicly available and I am willing to cover the costs myself if supervisors (grant holders) do not.

As long as the cost is reasonable, in the worse case scenario I pay from my pocket. Better the data are safe and easily accessible for years to come than stored in spurious formats and difficult-to-access servers.

But for many others, covering the payment can be a real pain point:

I paid the processing charge myself mainly because our University’s reimbursement process was so laborious, I felt it easier just to get it over and done with myself and absorb the relatively small cost personally.

I just have to beg and plead for funding support each time.

If I am publishing after the postdoc ends then I am no longer paid to work on the project. Since I have had four postdocs, each lasting less than two years, this has happened for all my publications.

Examples from the “other” payment category shown above illustrate the scrappiness of researchers in finding funding:

I paid this from flexible research funds that were recently awarded by my institution. Had that not occurred, I would have had to pay personally and not be reimbursed.

I used my RTF (research trust fund) since I didn’t have dedicated grant funding.

Scavenged money from other projects.

Key takeaways

Our preliminary results show that at a time of more and stronger open data policies, paying for data publication remains far from straightforward, with much of the burden passed along to individual researchers.

Concerns about funding for open data can have real impacts on research availability and publication choice. More than 15% of our participants indicated that they have collected data in the last few years that they have been unable to archive due to lack of funds. Meanwhile, over 40% say that when choosing which journal(s) to submit to, sponsorship of the Dryad DPC does, or at least may, influence their decision.

The good news it that during our 8-month pilot implementation period, the US National Science foundation sponsored nearly 200 Data Publishing Charges for which researchers would otherwise have been responsible.

We at Dryad are committed to finding and implementing solutions, and very much appreciate the feedback and support we receive from the research and publishing community. Stay tuned for more lessons learned.