Five-ish Minutes With: Charles Fox

In our latest post, our Executive Director Melissanne Scheld sits down with Dryad’s Board of Directors Chair, Professor Charles Fox, to discuss challenges researchers face today, how Dryad is helping alleviate some of those pain points, why Dryad has had such staying power in a quickly changing industry,  . . . and then we move on to dessert. 


Can you tell us a little about your professional background and how that intersects with Dryad’s mission?

I wear two hats in my professional life – I am an evolutionary ecologist who studies various aspects of insect biology at the University of Kentucky, and I am a journal editor (Executive Editor of Functional Ecology).

My involvement with open data and Dryad began fortuitously in 2006. The British Ecological Society was invited to send a representative to a Data Registry Workshop, organized by the Ecological Society of America, to be held that December in Santa Barbara, California. I am (and was at that time) an editor of one of the British Ecological Society’s journals, Functional Ecology, and I live in the U.S. So Lindsay Haddon, who was Publications Manager for the BES, asked me to attend the workshop as their representative. Before that meeting I don’t recall having thought much about open data or data archives, but I was excited to attend the meeting in part because the topic intrigued me and, selfishly, because my parents live in southern California and this was an opportunity to visit them. The discussions at that meeting, plus those at a couple of follow-up meetings over the next couple of years, including one at NESCent in Durham, North Carolina, and another in Vancouver, convinced me that data publishing, and open data more generally, should be a part of research publication. So I began lobbying the BES to adopt an open data policy and become a founding member of Dryad. I wrote a proposed data policy – just a revision of the Joint Data Archiving Policy (JDAP) that many ecology and evolution journals adopted – and submitted that proposal to the BES’ publication committee. It took a few years, but in 2011 the BES adopted that data policy across their suite of journals and became a member of Dryad. The BES has since been a strong supporter of open data and required data publication as a condition of publishing a manuscript in one of their journals. Probably because I was a vocal proponent of data policies at BES meetings (along with a few others, most notably Tim Coulson), I was nominated to be a Dryad board member, and was elected to the board in 2013.

As an educator,  what are some of the biggest changes you’ve seen in the classroom during your career?

When I started teaching, first as a graduate student (teaching assistant) and then as a young university professor, we didn’t have PowerPoint and digital projectors. So I made heavy use of a chalkboard (or dry erase board) during lecture, and used an overhead projector for more complicated graphics. Students had to take detailed notes on the lecture, which required them to write furiously throughout the class. Nowadays I produce detailed PowerPoint slides that include most of the material I cover, so I write very little on the chalkboard. And, because I can provide my slides to students before class – as a pdf that they can print and bring to class – the students are freed from scribbling furiously to capture every detail. Students still need to take some notes (my slides do not include every detail), but they are largely freed to listen to lecture and participate in class discussions. I am not convinced, though, that these changes have led to improved learning, at least not in all students. Having information too easily available, including downloadable class materials, seems to cause some students to actually disengage from class, and ultimately do poorly, possibly because they think they don’t need to attend class, or engage when they do attend, since they have all of the materials easily accessible to them outside the classroom.

What do you think the biggest challenges are for open science research today?

I have been amazed at how quickly open data has become accepted as the standard in the ecology and evolution research communities. When data policies were first proposed to journals there was substantial resistance to their adoption – journals were nervous about possibly driving away authors, and editors (who are also researchers) shared the views that were common in the community regarding ownership of their own data – but over just a few years the resistance largely disappeared among editors, societies and publishers, such that a large proportion of the top journals in the field have adopted policies requiring data to be published alongside research manuscripts. That said, some significant challenges remain, both on the researcher side and on the repository side. On the repository side, sustainable funding remains the largest hurdle. Data repositories cost money to run, with staff and infrastructure among the main expenses. Dryad has been relying on a mix of data publication charges (DPCs) and grants to fund its mission. This has worked for us so far, but constantly chasing grants is a lot of work for those writing grants, and the cost to researchers paying DPCs, albeit small, is not trivial for those without grant support.

On the researcher side, though data publishing has mostly become an accepted part of research publication in the community, there remain many important cultural and practical challenges to making open data universally practiced.  These include the development of standards for data citation and reuse (not restrictions on data reuse, but community expectations for citation and collaboration), balancing views of data ownership with the needs of the community, balancing the concerns of researchers that produce long-term datasets with those of the community, and others. We also need to improve education about data, such as teaching our students how to organize and properly annotate their datasets so that they are useful for other researchers after publication. Even when data are made available by researchers, actually using those data can be challenging if they are not well organized and annotated.

When researchers are deciding in which repository to deposit their research data, what values and functions should they consider?

Researchers should choose a repository that best fits the type of data they have to deposit and the community that will likely be reusing it. There are many repositories that handle specialized data types, such as genetic sequence data or data to be used for phylogenetic analysis. If your data suits a specialized archive, choose that. But the overwhelming majority of data generated by ecologists don’t fit into specialized archives. It’s for these types of data that Dryad was developed.

So what does Dryad offer researchers? From the perspective of the dataset author, Dryad links your dataset directly to the manuscript you have published about the dataset. This provides users detailed metadata on the contents of your dataset, helping them understand the dataset and use it correctly for future research. Dryad also ensures that your dataset is discoverable, whether you start at the journal page, on Dryad’s site, or any of a large number of collaborator services. The value of Dryad to the dataset user is similar – easy discoverability of data and clear links to the data collection details (i.e., links to the associated manuscripts).

You’ve held several roles on Dryad’s Board of Directors – what about this organization compels you to volunteer your free time?

My experiences as a scientist, a journal editor, and participating in open data discussions have convinced me that data publication is an essential part of research publication. For decades, or even centuries, we’ve relied on a publishing model where researchers write manuscripts that describe the work they have done and summarize their results and conclusions for the broader community. That’s the typical journal paper, and was the limit of what could be done in an age where everything had to fit onto the printed page and be distributed on paper. Nowadays we have near infinite space in a digital medium to not just summarize our results, but also provide all of the details, including the actual data, as part of the research presentation. It will always be important to have an author summarize their findings and place their work into context – that intellectual contribution is an essential part of communicating your research – but there’s no reason that’s where we need to stop. I imagine a world where a reader can click on a figure, or table, or other part of a manuscript and be taken directly to the relevant details – the actual data presented in the figure, the statistical models underlying the analyses, more detailed descriptions of study sites or organisms, and possibly many other types of information about the experiment, data collection, equipment used, results, etc. We shouldn’t be constrained by historical limitations of the printed page. We’re not yet even close to where I think we can and should be  going, but making data an integral part of research publication is a huge step in the right direction. So I enthusiastically support journal mandates that require data to be published alongside each manuscript presenting research results. And facilitating this is a core part of Dryad’s mission, which leads me to enthusiastically support both Dryad’s mission and the organization itself!

Pumpkin or apple pie?  

Those are my two favorite pies, so it’s a tough question. If served a la mode, i.e., with ice cream, then I’d most often pick apple pie. But, without ice cream, I’d have to choose pumpkin pie.

Stay tuned for future conversations with industry thought leaders and other relevant blog posts here at Dryad News and Views.

 

The way forward at Dryad


Melissanne Scheld, Executive Director, takes time to reflect on the Dryad/CDL partnership and to share thoughts on the direction of this collaborative effort.

It’s been a fast two months since I joined Dryad at this pivotal and exciting juncture. As previously announced, this spring Dryad entered into a formal partnership with California Digital Library (CDL) to ensure long-term sustainability for Dryad and to reinforce two essential,  shared goals:

  1. Create sustainability for open-source, community-owned, data curation & publication infrastructure
  2. Drive adoption of curated data publishing in the research community.

Where we are

For the past decade, Dryad has served as a highly regarded, non-profit, curated repository for research data across disciplines. None of that is changing!

Going forward we need to better meet researchers within their own workflows. We need to make the action of submitting research data even easier so that it becomes a seamless step within the publishing process.

We are currently working to migrate the Dryad system onto CDL’s Dash platform. Using an Agile framework, developers from both Dryad and CDL are collaborating to build an open-source, nimble service that will offer a higher level of administrative functionality, an improved curation layer, and various submission options.

Where we’re going


Researchers will find our new offering continues to meet funder requirements and sets the bar in best practices for data sharing. Using the FAIR data principles as a guide, the curation we perform on each dataset deposited eases findability and usability, while the new levels of enhanced integrations we plan to develop (more on this below) will further improve submitters’ workflows.

For institutions, we want to offer an infrastructure that supports local research data management through features including campus single sign-on, bespoke reporting, integration with local repositories, and campus co-branding. The global network of libraries, which CDL is part of, will help us reach a wider range of institutions that are also looking for data management solutions.

Dryad has always had strong publisher support; our new offering will improve these partnerships through enhanced API integrations. Going forward we will build upon our publishing partners while also working with platform providers to develop direct integrations. This will provide a more automated submission process around the transmission of metadata and DOIs.

We want to build modular infrastructure that is future-proof. We should be thinking about data publishing both as its own entity and in conjunction with article publishing. There are many avenues for circulating research, and data publishing should be a part of all of these. Publishing data should be as ‘easy’ and ‘standardized’ as article publishing.

Along with more robust infrastructure, we need to rethink how we build Dryad’s sustainability.  As a small, lean, non-profit, we need to build financial models that don’t overburden any single segment of our community, but still allow us to support the high level of curation and preservation infrastructure for which Dryad is known.

We are currently market testing new models within our community and have been talking with institutions and publishers to hear how we can best support their data publishing needs and what shared costs might look like. We know that there has been a lot of talk lately in our wider community about membership models; early feedback from our partners indicates this is still the most favorable method for investing in long-term sustainability.

What will success look like for us?  

The Dryad/CDL partnership aims to create a self-sustaining, curated, digital data repository for researchers across all fields of inquiry, based on the needs of and supported by institutional and publisher community members. We are building from a strong foundation, have created a thoughtful roadmap through community feedback, and are confident we are on a pathway to sustainability.

Personally, I’m very excited about all of these changes and know that, in partnership with CDL, we will be able to better serve our community. I look forward to updating you on future developments, but in the meantime, please don’t hesitate to reach out to me at director@datadryad.org with any questions or comments.

Dryad welcomes Scheld as new Executive Director

Dryad is excited to announce the appointment of Melissanne Scheld as Executive Director.

Melissanne joins as Dryad embarks upon our 10th year of providing open, not-for-profit infrastructure for scholarly data, and as we begin a strategic partnership with California Digital Library (CDL) to address researcher needs by leading an open, community-supported initiative in research data curation and publishing.

We are pleased Melissanne is joining us at this auspicious point in Dryad’s trajectory. With over 25 years of experience working with the academic community, and with her knowledge of the scholarly communications industry, we are confident she will successfully lead Dryad into our second decade as a community-supported provider of open data services.

–Charles Fox, Dryad Board of Directors Chairperson and Professor, University of Kentucky

Melissanne most recently served as Managing Director of Publishers Communication Group, a scholarly publishing consultancy, and has previously held positions at the university presses of Cambridge, New York University, and Columbia.

To welcome our new ED and/or inquire about ways to get involved with Dryad, send an email to director@datadryad.org.

Introducing the Dryad BOD Class of 2021

We are thrilled to announce the latest additions to the Dryad Board of Directors.

Our 12-member Board is intended to be a diverse group, with a mix of backgrounds and skills useful to represent the various stakeholders in the Dryad community: publishers, researchers, technologists, funders, and libraries. BOD members are elected or re-elected each year by the membership to serve 3-year terms.

New members for 2018-2021

The following individuals have assumed their duties:

Wolfram Horstmann has been Director of Göttingen State and University Library since 2014. Prior to that, he was Associate Director at the Bodleian Libraries of the University of Oxford, UK and CIO at Bielefeld University, Germany. He is Professor at the Information School of the Humboldt University in Berlin, teaching Electronic Publishing, Open Access and Open Science. He is a biologist by training and worked on the epistemology of simulations for his doctoral thesis. Read more about Wolfram.

Paolo Mangiafico is the Scholarly Communications Strategist at Duke University and Director of the Scholarly Communication Institute. In his role at Duke, Paolo works with librarians, technologists, faculty, students and university leadership to plan and implement programs that promote greater reach and impact for scholarship in many forms, including open access to publications and data and emerging platforms for publishing digital scholarship.

Caroline Sutton is Director of Editorial Development with Taylor & Francis. Before joining the company in October 2016, she was co-founder of Co-Action Publishing, a full OA publisher. She helped to found and served as the first President of the Open Access Scholarly Publishers Association (OASPA) and is a member of the present board. At Taylor & Francis, Caroline has led efforts to roll out data sharing policies as well as initiatives related to open scholarship across subject areas.

Paul Uhlir, J.D. is a consultant in information policy and management. He was Scholar at the U.S. National Academy of Sciences (NAS) in Washington, DC in 2015-2016, and Director of its Board on Research Data and Information, 2008-2015. He was employed at the NAS in various capacities from 1985-2016. Paul has won several prizes from the NAS and the international CODATA in data policy, and is a Fellow of the American Association for the Advancement of Science (AAAS). Read more detailed information about his professional activities.

Officers for 2018-2019

Shout-out to our new slate of Board officers:

  • Charles Fox (Chair)
  • Johan Nilsson (Vice-chair)
  • Brian Hole (continuing as Treasurer)
  • Fiona Murphy (Secretary)

Ex Officio members

Filling out the Board roster are two members in Ex Officio (non-voting) status. We are thankful that Todd Vision, longtime BOD member and PI of grants supporting Dryad, will continue to serve. We also welcome Günter Waibel, Associate Vice Provost and Executive Director of California Digital Library, in this capacity to represent our recently-announced partnership with the CDL.

Finally, we wish to express our sincere appreciation to outgoing BOD members and officers for their work on behalf of Dryad and open data.

Dryad welcomes new board members

Today we celebrate our Board of Directors, and introduce three new members whose expertise and wide-ranging skills will help advance Dryad’s mission to provide free and easy access to data.

Dryad’s 12-member BOD supports and promotes our mission to make the data underlying scientific publications discoverable, freely reusable, and citable. The Board is composed of diverse stakeholders, representing publishing, research, policy development, data networks, private funding, and scholarly organizations. BOD members are nominated by Dryad members and are elected or re-elected each year. They do not represent the organizations to which they belong; rather, they act as individuals in their involvement in the strategic planning and fiscal oversight of the company.

Who are the new members for 2017?

Adding to our esteemed Board of Directors this summer, we introduce our newest members:

Brian Hole (Class of 2020) will serve as treasurer of the Board. He is the CEO of Ubiquity Press, an open access publisher that focuses on alternative research outputs such as data, software, hardware, and bioresources. Previously, he managed the DryadUK project at the British Library, which focused on establishing a sustainable business model and publisher integrations, and also on building cost models for digital preservation. Brian brings a valued data-centric research background and detailed knowledge of open access publishing to Dryad this year.

 Fiona Murphy (Class of 2020) will serve as secretary of the Board. She is an independent research data and publishing consultant for institutions, societies, and commercial publishing companies and an Associate Fellow at the University of Reading. Fiona has written and presented widely on data publishing, open data, and open science. She has been involved in several research projects including PREPARDE, Data2Paper, and the Scholarly Commons Working Group. As an active member and sometime Co-Chair for several Research Data Alliance Groups focusing on data publishing policies, workflows, and accreditation systems, Fiona has organized several data-related events and sessions at scientific meetings.

Carly Strasser (Class of 2020) is a Program Officer at the Gordon and Betty Moore Foundation and is especially interested in open science and scholarly communication. She works in the Data-Driven Discovery Initiative, which is focused on promoting both the researchers and the practices required for high impact data-driven research. Previously, Carly was a Research Data Specialist at the California Digital Library where she was involved in development and implementation of many of the University of California Curation Center’s services, and worked to promote data sharing and good data management practices. Carly’s prior experience as a researcher in marine science and mathematical ecology has informed her work of ushering in the new era of open, transparent, and collaborative science.

We wish to thank our current and past members for bringing their expertise and passion to help advance Dryad’s mission and we look forward to their contributions and to another exciting year of open data.

And Now, the Numbers . . .

As the new year begins, we take note of the increasing diversity of fields represented in data archived at Dryad and review the numbers for 2016.

Dryad Grows into a General Repository

We are excited to see Dryad’s role in the preservation of data expand into new areas and fields in 2016. Researchers submitted more data involving human subjects and data from social media. In addition, a quick look at our most popular data shows that two of the top five downloaded packages were from the fields of cardiology and science journalism. While Dryad’s origins are in the life sciences, it is increasingly being used as a general repository for data from a myriad of fields.

Let’s take a look at the numbers for 2016:

Increase in Number of Data Packages and Data Files

Our curators were busy! The total number of published data packages (sets of data files associated with a publication) at the end of the year was a whopping 15,325. Our curators meticulously archived 4,307 packages, a 10% increase from 2015. The size of data packages also continued to grow – from an average of 481MB to an average of 573MB, an increase of about 20%.
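For readers who want to check the growth figures above, here is a minimal sketch of the arithmetic. The function name `percent_increase` is ours, for illustration only, and the 2015 package count is a back-calculation from the reported 10% increase, not a number taken from Dryad's records.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Average data package size in MB, as reported for 2015 and 2016.
size_growth = percent_increase(481, 573)
print(f"Average package size grew by {size_growth:.1f}%")

# 4,307 packages archived in 2016 at a reported 10% increase implies a
# rough 2015 figure. This is an estimate (assumption), not a reported number.
implied_2015 = 4307 / 1.10
print(f"Implied packages archived in 2015: about {implied_2015:.0f}")
```

The size increase works out to a little over 19%, which is consistent with the "about 20%" quoted above.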

At the end of 2016, we were closing in on 50,000 archived data files; by January of this year, we passed that mark.

In a future blog, we’ll talk about the integration of new journals into the Dryad submission process, new members, and new partnerships. For now, we’ll just note that there was a 22% increase in the number of journals that have data in Dryad linking back to the article.

New Fields

We’ve seen a significant uptick in human subjects data and social media data this year, which has prompted us to develop an FAQ on cleaning and de-identification of human subjects data for public access. As the idea of what data should be preserved continues to broaden, submissions of these kinds of data will only increase. We’ll keep you updated about this trend in future blogs.

Top Downloads

Let’s take a look at the most popular data published in 2016, in terms of downloads. The top 5 downloads include data on plant genetics, the early history of ray-finned fishes, and, not surprisingly in this age, the effects of climate change on boreal forests.

Also of interest are data from an article in Science evaluating how people make use of Sci-Hub, an open source scholarly library. Our guest blog on these data by science journalist John Bohannon generated a lot of interest this year and was one of our most popular blog posts ever.

Another significant development in 2016 came from the medical sciences. A comparison of coronary diagnostic techniques marked Dryad’s first submission from one of the top five cardiology journals, JACC: Cardiovascular Interventions.

The fact that 2 of the 5 top downloads come from fields outside of life sciences clearly indicates that data in Dryad now cover a broad range of fields.

Top 5 Downloads of Data Archived in 2016

Article | Dryad DOI | Number of Downloads
Wagner MR et al. (2016) Host genotype and age shape the leaf and root microbiomes of a wild perennial plant. Nature Communications 7: 12151. | http://doi.org/10.5061/dryad.g60r3 | 3123
Bohannon J et al. (2016) Who’s downloading pirated papers? Everyone. Science 352(6285): 508-512. | http://doi.org/10.5061/dryad.q447c | 2969
D’Orangeville L et al. (2016) Northeastern North America as a potential refugium for boreal forests in a warming climate. Science 352(6292): 1452-1455. | http://doi.org/10.5061/dryad.785cv | 741
Johnson NP et al. (2016) Continuum of vasodilator stress from rest to contrast medium to adenosine hyperemia for fractional flow reserve assessment. JACC. Cardiovascular Interventions 9(8): 757-767. | http://doi.org/10.5061/dryad.f76nv | 453
Lu J et al. (2016) The oldest actinopterygian highlights the cryptic early history of the hyperdiverse ray-finned fishes. Current Biology 26(12): 1602–1608. | http://doi.org/10.5061/dryad.t6j72 | 423

Overall, we’ve had a great year and are delighted to be seeing a broader range of data from an increasing number of journals and fields. Thanks to our Board of Directors, members, and of course our staff for providing their support to make 2016 a notable year for Dryad!

New pricing structure with simplified terms and increased size limits


Over the last few years, we’ve learned a lot about what is needed to curate, preserve, and provide access to data for the long term, as well as to sustain an independent not-for-profit organization. We’ve also paid close attention to the needs and wants of our user community and members. To meet these needs, we are revising our pricing structure for the first time since it was introduced in 2013.

  • Submissions initiated after 4 January 2016 will have a base Data Publication Charge (DPC) of US$120.
  • Pricing is now the same for all journals – there will no longer be an additional surcharge for non-integrated publications.
  • We encourage individuals and small groups to purchase bundles of DPC vouchers in advance and in any quantity. Purchases over 25 DPCs will enjoy a discount.
  • As a further user benefit, we will be doubling the maximum package size before overage fees kick in (to 20GB) and simplifying and reducing the overage fees.
  • We will continue to waive DPCs for researchers from World Bank low-income and low-middle-income economies upon request.
  • Membership fees are not changing, but Dryad members will be entitled to receive larger discounts on DPCs.
  • As always, there are no fees to download or reuse data from Dryad.
  • Integrating Dryad’s system with partner journals remains a free service.

Dryad’s Board of Directors will continue to keep a close eye on the repository’s sustainability progress. We anticipate this price structure will remain stable for the foreseeable future and are always seeking opportunities for savings and efficiencies.

We are grateful to our community supporters and take seriously the responsibility to ensure the long-term availability of the research data entrusted to us.

Prepaid data submission vouchers can be purchased at current pricing levels ($80 apiece) through January 4th (and at the new price of $120 apiece after that), by contacting help@datadryad.org.

Payment plans are either subscription or usage-based. Organizations and individuals may also make advance purchases of any number of DPCs and are eligible for bulk discounts for purchases of 25 or more.
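To put the voucher deadline in concrete terms, the short sketch below compares buying the same number of vouchers before and after the change. The names `voucher_cost` and `savings_from_early_purchase` are ours, for illustration. Bulk and member discounts are deliberately left out, since the discount rates are not stated here.

```python
OLD_DPC_USD = 80   # voucher price through January 4th
NEW_DPC_USD = 120  # base DPC for submissions after that date


def voucher_cost(count: int, price_usd: int) -> int:
    """Total cost of `count` vouchers at a flat price, with no discount applied."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return count * price_usd


def savings_from_early_purchase(count: int) -> int:
    """Difference between paying the new price and the old price for `count` vouchers."""
    return voucher_cost(count, NEW_DPC_USD) - voucher_cost(count, OLD_DPC_USD)


# Example: a lab expecting to deposit 10 datasets (hypothetical figure).
planned_vouchers = 10
print(f"Cost at old price: ${voucher_cost(planned_vouchers, OLD_DPC_USD)}")
print(f"Cost at new price: ${voucher_cost(planned_vouchers, NEW_DPC_USD)}")
print(f"Saved by buying early: ${savings_from_early_purchase(planned_vouchers)}")
```

Each voucher bought at the old price saves $40 against the new base charge, before any discounts that may apply.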

What exactly do your DPCs cover?

The following breakdown of expenses reflects projected costs in the near future, extrapolating from historic growth rates. Approximately half of the costs are associated with Repository Management, including membership-based nonprofit governance, communications with Dryad’s many stakeholders, members and partners, and upkeep of software systems (Repository Maintenance). Another quarter of the costs are due to the curation and user support provided to each data package, part of Dryad’s unique service offering and commitment to quality.

Since Dryad is a virtual organization, Infrastructure & Facilities largely covers server costs, digital storage, and interoperability technologies such as Digital Object Identifiers (DOIs). A small fraction goes to community outreach activities to help encourage data publication best practices and raise awareness of Dryad. Administrative Support covers essential functions such as accounting and contract review.

Finally, Research and Development is essential for building new features to support changing technology and user expectations. R&D expenses are included here, but would ordinarily be covered through special project grants and not considered an operating expense paid for through DPCs.

We expect that as efficiencies are put into place, volume increases, and further economies of scale are realized, the percentage of the DPC supporting Repository Management will decrease and other areas, most notably Curation, will increase.
