Looking ahead: a letter from our Executive Director

Just over three months into my role as Dryad Executive Director, I’m reflecting on my transition and what I’ve learned so far – what we’ve been doing and where we’re planning to go next in our programmatic work as well as organizational capacity and sustainability planning. But this transition has been more than just a new role for me: it comes at a time when, as a scientific community, we’re reflecting on our responsibilities, who our systems serve and who they don’t, our collaborations, what it means to work and share openly, and our capacity to work together to do better. This moment demands our attention and our focus, but these challenges and opportunities require more than short-term commitments. We need to continue to reflect and build, not just something that meets this moment, but something that will sustain us and move us forward towards our vision of sharing data and working together to create knowledge for science and society. Rather than just reacting, we can respond with intention – evaluating, supporting and re-envisioning our systems and our communities.

Previously, as Executive Director of The Carpentries, I worked with the community to create inclusive training, teaching people to work effectively with data and code, in a movement to empower all people to answer questions that are important to them. In that work, I noticed that another missing piece is access to the data to answer those questions. We need to both bring people to data and bring data to people, to democratize data, for each moment, each crisis, each person, each challenge and each opportunity. This work can only be accomplished if all researchers, organizations and institutions work together – sharing knowledge, creating pathways and connections and building relationships. Through my work as a researcher, with The Carpentries and now with Dryad, I’ve had the great opportunity of meeting many members of the data community, and while we have challenges in front of us, I am inspired by the commitment of so many to effect change and look forward to continuing these conversations and work within the Dryad community.

Dryad’s connections 

Dryad is researcher-led and open, both in its publishing and in its community, qualities that drew me to the organization. What I’ve continued to learn about Dryad in the last few months is that Dryad’s alignment with the scientific community and those promoting open, curated data publications enables us to focus on how to curate and publish quality datasets at scale. I believe that part of the reason Dryad has seen such success in recent years is that it has built like-minded partnerships and collaborations over its tenure. By working closely with those that hold similar values, like the National Science Foundation, institutional resources (e.g., NESCent), the Data Curation Network, and now our partners California Digital Library and Zenodo, we can have an even more effective reach in the scientific community. Going forward, it’s important to think not just about submitting data, but about listening and working together to pioneer new ways of making data re-use more prevalent and accessible.

Curation at our core

These ideals are heavily reflected in our data publishing platform, but also in our curation emphasis. To promote and publish FAIR data, it is essential that Dryad maintains its roots in curation, even if curation looks different now than it did at Dryad’s inception. I’m finding that as the community shifts to value curation, we need to question what quality curation practices look like at multi-disciplinary repositories. Disciplinary repositories have long supported detailed curation, specific to the type of data being submitted. At Dryad, we need to assess what quality looks like at our scale and across disciplines, thinking about what level of checks is appropriate and attainable, and adjust both our workflows and the community’s expectations. We cannot do this alone.

In my first 90 days I have had the opportunity to learn from others and share my experience at multiple venues, ranging from the NIH’s General Repository workshop and meetings with the Data Curation Network to, more recently, the Open Publishing Fest with colleagues at Wormbase. In my next 90 days, I look forward to continuing to work with these groups, and with others interested in quality curation at scale, on how to implement it in a way that is accessible to researchers.

Going forward

Dryad has maintained and will continue to maintain broad researcher support. As the current climate has put a spotlight on the importance of data publishing, I want to take a moment to outline how I think Dryad can improve to more effectively publish FAIR, re-usable data, and where we can go next.

It is essential that Dryad remains focused on researcher-adoption of best practices for open data publishing. We’ve seen this adoption, with increased submissions annually, and more stories of data re-use, and it’s important that as research evolves we evolve with the scientists and broaden our diversity of perspectives. Our platform and partnerships are focused on seamless workflows to accommodate increased submissions and we are very excited about the upcoming integrations with journal submission systems and Zenodo.

With increased submissions and an emphasis on adoption, we will need to continue to optimize for quality and volume. Dryad has done this successfully over the last ten years, and going forward, I will be investigating the right level of curation and resources required for the growing scale. This may mean investing more in automated FAIR checking, tools for researcher education during the submission process, and considering the role of institutional data curators looking to steward their research outputs. 

Beyond these operational improvements, it is important for Dryad to continue to push the envelope in data publishing. Ten years ago, Dryad was critical in supporting the development of data policies at publishers. Data availability statements and having a place to house data will always be important. But, growing and evolving with researchers means putting a larger emphasis on data re-use and equitable access, as data-driven discovery gains traction and researchers are eager to broaden the impact of their research through data publications. We need to consider how to make data re-use more accessible, thinking about how this practice can be promoted and encouraging best practices.

The research data community has given me a warm welcome, and it is a community that I am thrilled to be a part of. I am privileged to have the opportunity to steward Dryad into the future as a trusted multi-disciplinary data repository. Today’s challenges continue to show the importance of collective impact, working together towards a shared goal, and the essential value of backbone organizations in open data. Dryad has played and will continue to play a leading role in the research data ecosystem. We have important work to do together, and as challenging as the current times are, they have also shown people’s instincts to help each other. Thinking about how to better operationalize Dryad, to better support researchers in data curation, publishing, sharing and re-use, is something that I cannot do alone, and I am very excited and grateful to continue working with our staff, partners and the larger community to go further together.

Thank you Elizabeth! Our Associate Director moving on to new opportunities

In 2014 Elizabeth Hull stepped into the basement office of NESCent at Duke University to begin working with Dryad as a curator. Since then Elizabeth has worked in almost every role in the organization: leading curation efforts, business operations, communications and team building; writing grants; answering thousands of emails on HelpDesk; serving as Interim Executive Director and Associate Director; and, most importantly, connecting with so many researchers, librarians and publishers in our community. After so many years of service as a steward of Dryad and leader of the team, Elizabeth has decided to move on to new opportunities.

As a team, we want to share our gratitude for her leadership, commitment, strength and grace under pressure, and for how she has welcomed and supported us all as team members. We will miss her work with Dryad, and we will miss her as a person on our team.

As our Board Chair Caroline Sutton recently commented: “our continued presence as Dryad is a testament to Elizabeth’s standing in the community and among our staff as well as her skills and dedication.”

We know Elizabeth has supported, welcomed, talked with and listened to so many in our community over the years and made a significant impact. Please share any of your own notes or thank-yous with us or with her directly.

Thank you Elizabeth, and we wish you all the best in your next endeavors!

Facilitating data sharing in times of crisis

Dryad has long been committed to the sharing of open data, supporting authors in depositing data and providing FAIR curation to improve metadata and data quality. We believe this mission is always important in supporting the advancement of science, but in light of the current public health crisis related to COVID-19 we see the need for especially rapid curation and publication of pandemic-related datasets. This public health crisis has changed the way we work as a society and has also changed scientific needs for rapid dissemination and analysis of research data and publications.

Amidst the COVID-19 outbreak we have seen amazing examples of the value of open data as well as the challenges of data variety, identifying sources and aggregating information. So many in the scientific community are putting exemplary energy and work into these efforts including Our World in Data, Nextstrain, The COVID Tracking project, GISAID and the GO FAIR Virus Outbreak Data Network to name just a few. These efforts are a tribute to scientists – as scientists and as people.

It is imperative that the community rapidly disseminate datasets related to COVID-19, and we must acknowledge that data quality is variable. Access to research data is critical, but it’s also essential that data are accompanied by high-quality metadata that can facilitate more effective and efficient reuse. Datasets need to be understood, checked for personally identifiable information and cleaned where necessary, and accompanied by standard metadata and accessible file types. Curation checks, while they slightly delay publication, can improve data quality, making reuse easier and improving the overall speed and quality of analyses.
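
As a rough illustration of what automated pre-curation checks of this kind might look like, here is a minimal Python sketch. It is not Dryad's curation workflow: the required metadata fields, the list of "open" file extensions, and the function name curation_flags are all hypothetical, and a simple email pattern stands in for a real check for personally identifiable information.

```python
import re

# Hypothetical settings, chosen only for illustration.
REQUIRED_METADATA = ("title", "authors", "abstract")
OPEN_EXTENSIONS = (".csv", ".tsv", ".txt", ".json")
# Very rough stand-in for a personally-identifiable-information check.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def curation_flags(metadata, files):
    """Return a list of issues for a curator to review.

    metadata: dict of descriptive fields for the dataset.
    files: dict mapping each filename to its text content.
    """
    flags = []
    for field in REQUIRED_METADATA:
        if not metadata.get(field):
            flags.append(f"missing metadata: {field}")
    for name, text in files.items():
        if not name.lower().endswith(OPEN_EXTENSIONS):
            flags.append(f"non-open file type: {name}")
        if EMAIL_PATTERN.search(text):
            flags.append(f"possible email address in: {name}")
    return flags
```

Checks like these can only flag candidates for review; deciding whether a dataset is understandable and safe to publish still depends on a human curator.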

During times of crisis when labs have closed down, we understand that many researchers are working with their previously collected data, applying computational approaches, analyzing or evaluating open datasets from other labs, and continuing to turn these data into knowledge. Dryad will continue to curate and publish all incoming research datasets, supporting all domains of research. 

We are also ever-conscious of the need for rapid data curation and publication, and we are taking extra steps within the current climate to look out for public health, economic, sociological and other datasets related to the pandemic. We are committed to working with the research community, including publishers and preprint providers, to facilitate the rapid and high-quality curation review and publication of any pandemic-related data. Don’t hesitate to reach out. We’re here to help.

This is one crisis that has the world’s attention and requires a rapid, coordinated response. But there are so many other moments that are important in individual communities, to specific topics, or to global challenges that may not seem as fast moving, such as climate change. In all of these moments, efficient data sharing and data we can trust are crucial. At Dryad we are thinking about our current practices for curating and publishing datasets, as well as what we can learn for future situations, so we can all continue to go further together.

Dryad & Zenodo: Our Path Ahead

In July 2019 we were proud to announce a funded partnership between Dryad and Zenodo. Today, we are excited to give an update on our future together.

Dryad and Zenodo have both been leading the way in open-source publishing of data, software, and other research outputs for the last decade. While our focus and adoption mechanisms may have been different, we’ve had similar values and goals all along: publish and archive non-traditional research outputs in an open and accessible way that promotes best practices.

In looking to expand our capacities for sharing data and software, it became clear that we could each benefit from the other’s expertise. Dryad has long focused on research data, curating each dataset published, and working in close coordination with publishers and societies to support journal data policies. Zenodo, based at CERN, builds on strong infrastructure capacity and has focused on software publishing and citation. It was clear that by working together, leveraging each other’s expertise, we could better achieve our goals.

Notably, we believe researchers should have an opportunity to publish curated data, software, and other research outputs at a trusted, open source set of repositories in a seamless way.

At the beginning of February, we brought our two teams together to understand each other’s repository systems and roadmaps, and to map our work ahead. We have broken this work down into a couple of segments and will begin with our first project, listed on our GitHub as “DJ D-Zed: Mixing Up Repositories”. In other words, we will be integrating our two systems to lower the barrier for researchers who want to follow best practices in publishing their software, data, and supporting information. The first direction of focus is publishing from Dryad to Zenodo.

So, what does it all look like?

This project entails re-imagining the Dryad upload interface to expand the scope of upload to accommodate researchers uploading more than data. Within this interface, through a series of declarations and machine reading, we will triage data, software, and supporting (other) files. Data should be curated and published at Dryad. Software requires a different set of license options, metadata, and other attributes, and supporting files benefit from a previewer, so these files are more appropriately published at Zenodo.
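
To make the triage idea concrete, here is a minimal Python sketch. It is not the Dryad or Zenodo implementation: the extension lists, the function names triage_file and triage_upload, and the rule that a researcher's declaration overrides the automated guess are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical extension lists, chosen only for illustration.
SOFTWARE_EXTENSIONS = {".py", ".r", ".ipynb", ".sh", ".jl"}
DATA_EXTENSIONS = {".csv", ".tsv", ".txt", ".json", ".xlsx"}

# Routing described in the post: data to Dryad, software and supporting
# files to Zenodo.
DESTINATIONS = {"data": "Dryad", "software": "Zenodo", "supporting": "Zenodo"}


def triage_file(filename, declared_type=None):
    """Return (category, destination) for one uploaded file.

    An explicit declaration by the researcher wins; otherwise a guess from
    the file extension stands in for "machine reading".
    """
    if declared_type in DESTINATIONS:
        category = declared_type
    else:
        suffix = PurePosixPath(filename).suffix.lower()
        if suffix in SOFTWARE_EXTENSIONS:
            category = "software"
        elif suffix in DATA_EXTENSIONS:
            category = "data"
        else:
            category = "supporting"
    return category, DESTINATIONS[category]


def triage_upload(files):
    """Group an upload, given as {filename: declared_type or None}, by repository."""
    grouped = {"Dryad": [], "Zenodo": []}
    for filename, declared in files.items():
        _, destination = triage_file(filename, declared)
        grouped[destination].append(filename)
    return grouped
```

For example, triage_upload({"counts.csv": None, "analysis.R": None}) would route the CSV file to Dryad and the R script to Zenodo under these assumed rules.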

After curation, once the items are ready to be published, it is essential that we can link the works to each other through their DOIs and citations. As Dryad and Zenodo each mint DOIs for published works, it is our responsibility to expose the relationship between the software, data, and other citations so users can find all related work. Having separate citations for software and data will allow for more specific citation practices at journals, in preprints, etc.
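
One way such a relationship could be exposed is through the relatedIdentifier property of the DataCite metadata schema. The Python sketch below builds reciprocal entries for a dataset and its software. The choice of the IsSupplementedBy and IsSupplementTo relation types is an assumption about how the link might be modeled, not a statement of what the integration uses, and the DOIs in the usage note are placeholders.

```python
def related_identifier(doi, relation_type):
    """Build one DataCite-style relatedIdentifier entry pointing at a DOI."""
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }


def link_records(data_doi, software_doi):
    """Return reciprocal relatedIdentifier lists, keyed by the record they belong to.

    Assumed modeling: the dataset is supplemented by the software, and the
    software is a supplement to the dataset.
    """
    return {
        data_doi: [related_identifier(software_doi, "IsSupplementedBy")],
        software_doi: [related_identifier(data_doi, "IsSupplementTo")],
    }
```

Calling link_records("10.5061/dryad.example", "10.5281/zenodo.0000000") with these placeholder DOIs returns one entry for each record, so that a reader who lands on either the dataset or the software can find the other.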

It is essential that we acknowledge the importance of user testing. We have identified our minimum viable product, but the look and feel of this relies on close collaboration with our user experience teams and researcher user testing. This integration can only succeed if researchers see the benefits of using one entry point for two repositories, and are educated along the way about best practices for data and software. We’ll be planning opportunities for feedback at specific milestones, and appreciate comments via email or GitHub comments along the way.

What happens next

Our partnership relies on cross-organization co-development. Our teams have been spending time to understand how Dryad and Zenodo both function to ensure we are building for success for each of our user communities. Our initial user testing is about to ramp up, and we have begun the exploration into backend development to tie our systems closer together. As avid open-source supporters, we will track all of our work publicly on GitHub. Our code and documentation will also be available as new features are released.

User testing our workflows with researchers will help guide our development, but we also need to understand how this work can support Dryad and Zenodo’s larger communities: institutions, libraries, publishers, societies, funding agencies, and others that have a stake in research data and software publishing. We will have regular opportunities for feedback and we hope you will weigh in.

Check out our blogs for updates, and our Twitter to hear about upcoming meetings where we will be presenting. And if you have feedback, please get in touch, as always, with our Product Managers at Dryad and Zenodo.

Promoting our Open Source Communities

Dryad believes in the power and potential for Open Source solutions to tackle the challenges that face scholarly communications.  As part of this strategy, the Dryad platform is completely Open Source and our code is made publicly available on GitHub.  In addition, we are continuously striving to build partnerships with other Open Source projects and help grow the Open Source communities we rely on.

In an effort to promote and elevate Open Source projects from across the scholarly communications space, Dryad is proud to partner with eLife Innovation (elifesciences.org/about/innovation) and FORCE11 (force11.org) on a series of Open Source Community Calls. These calls are an informal way to discuss and learn about emerging and established projects that promote open approaches to publishing datasets, articles and preprints as well as discovery, evaluation, and more.

The goal of each call is to allow for Open Source projects to give updates on recent releases or significant changes. Each call will have pre-selected presenters as well as time set aside for additional attendees to jump in with their own updates. All webinars will be recorded and summarized for future reference.

Next Call: February 25

The next call will be on Tuesday, February 25, 2020, 8am PT / 11am ET / 4pm GMT.  Whether you have developments to share, or simply would like to listen in to hear what’s new, please register to join the call.

On February 25, we will hear from:

  • Popper, a tool that allows researchers to automate the execution and validation of computational and data-intensive experimentation workflows;
  • Outbreak Science Rapid PREreview (OSrPRE), a platform for the rapid review of outbreak-related preprints;
  • Open Climate Knowledge, a fully open, collaborative research project to gather climate change knowledge using data-mining and open publishing; and
  • Open Publishing Awards, update from the organizers of an annual celebration of the Open Source software that supports our publishing communities.

Please join us for a lively discussion as project leads and contributors from across the world share their work on exciting projects that are using cutting-edge technology to drive forward open science and research communication.

Join the discussion

More details about the presentations and the opportunity to contribute your own updates are on the open agenda. Please register to join the call.

The agenda is open to anyone who would like to present, in five minutes or less, an open-source project that has relevance to open science and research communication.

Welcoming the Chan Zuckerberg Initiative to Our Member Community

We are proud to welcome our first philanthropy to join the Dryad member community: Chan Zuckerberg Initiative (CZI).

By joining Dryad, CZI will cover the cost of curation and preservation for their grantees’  data publications, supporting them in following best practices for open data.

“The discoverability and availability of research data is a critical component of an open, reproducible, and verifiable scientific ecosystem, and Dryad provides essential infrastructure in support of this mission. The Chan Zuckerberg Initiative is pleased to join the Dryad community.”

– Alex Wade, Open Science Program Manager (CZI)

We are thrilled that CZI will be joining publishers and institutional members in lowering the barrier for researchers to publish their curated research data. For more information about our memberships and to join the community, get in touch.

Institutional member round-up and shout-out

As we settle into the new year, we’re thrilled to report that Dryad’s institutional member community continues to expand and diversify. This should only pick up steam as we welcome our new Executive Director Dr. Tracy Teal next month, who brings significant community-building experience.

We applaud these new Dryad members in their efforts to support data stewardship and research on their campuses and beyond:

* (our first Australian institutional members!)


To join as an institutional member, contact us today!

Deep Roots & Strong Branches: A Recap and Preview of Dryad’s Development Plans

Happy 2020! Kicking off the new year, our product development team wanted to take a moment to introduce our development processes and provide a glimpse into Dryad’s future directions. 2019 was an exciting year with our growth of 15% in submissions and the release of our new Dryad. This release was the culmination of a year and a half of work building a new, combined product development team (at Dryad and CDL) and developing new features to support Dryad’s user base. Since then, the work has not stopped. Our team has been working to continually meet user needs and better our services. 

Members of the Product Development Team launching the new Dryad in September, 2019 (Left to Right: Daniella Lowenberg, Ryan Scherle, Marisa Strong, Scott Fisher, Brian Riley)

The Dryad development process

The Dryad product development team follows agile methodologies, working and releasing in two-week sprints. This means we prioritize feature development and bug fixes based on user needs (which are ever evolving). This work is tracked on our public project board here. Feature development also includes working with our user experience team to design interfaces that are both accessible for and understood by our users. Outward-facing features are tested for specific user groups (researchers, curators, members, etc) before development and before each release. At the end of each sprint, we post our release notes covering at a high (and sometimes technical) level what was completed.

This type of development work means that we depend on community feedback to help identify the features necessary for making data publishing as easy as possible and for ensuring that published datasets are usable. There are hundreds of features we would love to build or enhance, and hearing productive feedback from the community helps to guide our development priorities. If you have a feature request, or would like to report a bug, you may log a ticket here. Our product manager regularly grooms the cards and will be in touch with more questions when that work is prioritized.

What we’ve been building

In the last three months, we have been primarily focused on ensuring the new platform can support the growing Dryad community. This means building up a robust, accessible platform and enhancing researcher facing features.

One of Dryad’s key strengths is its high adoption rate. This means that the platform receives heavy traffic loads. To support these loads over the long term and as the user base grows, we have been putting in various reinforcement features like load balancing our servers, improving reliability of our downloads, and actively monitoring/blocking bots as necessary to ensure the site can avoid any downtime.

Our other development work has included addressing accessibility and feature optimization, including:

  • Adjustments to our interface to be a more accessible service for our users
  • Enhancements for the auto-fill features (journal name, institutional affiliations) to reduce lag and better the author submission process
  • Updating our DataCite schema, allowing Dryad to send author institutional affiliations (RORs) to DataCite, enabling better tracking of dataset publications by affiliation and supporting consumption by initiatives like FREYA and Make Data Count.
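
As an illustration of the last item, the Python sketch below builds a creator entry whose affiliation carries a ROR identifier, using the affiliationIdentifier and affiliationIdentifierScheme attributes of the DataCite metadata schema. It is a sketch of the general shape rather than Dryad's code: the function name creator_with_ror is hypothetical, and the identifier used in the usage note is a made-up placeholder, not a real ROR ID.

```python
ROR_PREFIX = "https://ror.org/"


def creator_with_ror(name, institution, ror_id):
    """Build a DataCite-style creator entry with a ROR-identified affiliation.

    Accepts either a bare ROR ID or the full https://ror.org/ URL.
    """
    if not ror_id.startswith(ROR_PREFIX):
        ror_id = ROR_PREFIX + ror_id
    return {
        "name": name,
        "affiliation": [
            {
                "name": institution,
                "affiliationIdentifier": ror_id,
                "affiliationIdentifierScheme": "ROR",
            }
        ],
    }
```

For example, creator_with_ror("Doe, Jane", "Example University", "00placeholder") produces an entry whose affiliation points at https://ror.org/00placeholder. Because the identifier, and not only the free-text institution name, travels with the record, publications can be grouped by institution even when the name is spelled differently across datasets.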

This foundational work is key to strengthen the system and prepare for new feature development work in 2020 and beyond. 

Where we are headed

Continuing to work in our two-week sprints, we will be building essential features for the researchers using Dryad (e.g., integrations, geolocation) as well as more complex functionality for our growing institutional and publisher member communities (e.g., integrations, reporting, data metrics aggregation). We also have embarked on a couple of larger projects that we are excited to share.

  • Zenodo – Dryad Partnership: Following on from our announcement in July 2019, we have embarked on a project to integrate Zenodo and Dryad, with a goal of providing researchers with a more seamless process for publishing data, code, and other materials. While the initial work has already been scoped, our official kick-off meeting is in a couple of weeks and we will update the community shortly thereafter with our project plans.
  • Editorial Manager & ScholarOne Integrations: Since many Dryad authors publish data in conjunction with an article, we have been building a direct integration with Editorial Manager, a leading journal submission platform. This work will allow for researchers submitting to a journal that uses Editorial Manager to have the option to publish their data at Dryad without actually leaving the Editorial Manager (article submission) system. We look forward to sharing more information about this implementation in the spring. We have also been working to map a similar integration with ScholarOne that will enable thousands of journals to integrate directly with Dryad.

Our open REST APIs are documented and available for use. We have been talking with undergraduate and graduate level students looking for coding projects to build integrations into our platform with R, Python, Jupyter, rOpenSci, and Binder. If you are interested in working with our APIs, get in touch!
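
For anyone curious what a first script against the API might look like, here is a minimal Python sketch using only the standard library. The base URL and the convention of addressing a dataset by its URL-encoded DOI are assumptions here and should be checked against the API documentation; the function names are hypothetical, and the DOI in the usage note is a placeholder.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed API root; confirm against the published API documentation.
API_ROOT = "https://datadryad.org/api/v2"


def dataset_url(doi):
    """Build the (assumed) URL for one dataset from its DOI.

    The DOI is prefixed with "doi:" if needed and fully percent-encoded so
    that its slash does not break the URL path.
    """
    identifier = doi if doi.startswith("doi:") else "doi:" + doi
    return f"{API_ROOT}/datasets/{quote(identifier, safe='')}"


def fetch_dataset(doi, timeout=30):
    """Fetch and decode the JSON metadata for one dataset (requires network access)."""
    with urlopen(dataset_url(doi), timeout=timeout) as response:
        return json.load(response)
```

With the placeholder DOI 10.5061/dryad.example, dataset_url returns https://datadryad.org/api/v2/datasets/doi%3A10.5061%2Fdryad.example. Keeping the URL construction separate from the network call makes the encoding step easy to test without touching the live service.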

We have a busy year ahead and we look forward to working with both researchers and research-supporting communities, continuing to make data publishing as seamless as possible. Follow along on our blog and Twitter for further updates.

Tracy Teal named new Dryad ED

After an extensive search led by Dryad’s Board of Directors, we are proud to announce that Dr. Tracy Teal will join Dryad as Executive Director beginning February 17, 2020. She has extensive experience leading a global, community-oriented non-profit, and we’re looking forward to working with her towards Dryad’s vision of a world where research data is openly available, integrated with the scholarly literature, and routinely re-used to create knowledge.

The Board and staff are extremely pleased that Tracy will be joining Dryad and believe that her extensive experience building communities and growing membership will help Dryad continue its upward trajectory. BOD Vice-Chair Johan Nilsson states:

“We are very happy to welcome Tracy as our new Executive Director. With Dryad’s partnerships with CDL and Zenodo and the recently launched institutional memberships, we have very exciting times ahead of us. We are fully confident that Tracy’s background and enthusiasm for open science makes her perfectly positioned to lead Dryad into this future.”

Tracy was most recently the Executive Director of The Carpentries and a co-founder of Data Carpentry, where she helped lead the organization through growth and transition. She received her PhD in Computation and Neural Systems from California Institute of Technology and was an NSF Postdoctoral Researcher in Biological Informatics. She worked at Michigan State University as a Research Specialist with the Institute for Cyber-Enabled Research and then as an Assistant Professor in Microbiology. Throughout her career she’s been working to empower people to work with data. Tracy told us:

“I am honored to be joining the Dryad team and have the opportunity to continue working to democratize data with the Dryad platform, partnerships and community. Data sharing, access and re-use as a public good is essential to the future of knowledge creation and the potential impact of data on science and society.”

NSF Workshop Overview: Focusing on Researcher Perspectives

Since its founding, Dryad has hosted a researcher-led, open data publishing community and service. With the California Digital Library partnership in 2018, and reflecting on a decade of Dryad’s existence, we have spent time exploring what it means to remain a community-owned data publishing platform. By convening publishers, institutions, and other scholarly communications stakeholders to discuss the meaning of community-ownership, we have begun to understand how research-supporters see their role in the Dryad community and leadership. But to better understand the meaning of “researcher-led”, we wanted to hear about researchers’ perspectives on community-led open infrastructure. 

With the support of a National Science Foundation Community Meeting grant (award #1839032), we hosted a meeting on October 4th, 2019, with folks from the founding Dryad research communities. Going back to our roots, and gathering both researchers who founded Dryad and early career researchers in Ecology and Evolutionary Biology, we held a day-long event centered around asking a diverse group of researchers: what does it mean for Dryad to remain researcher-led?

Focusing on research perspectives 

Kicking this off, we found it essential to hear from researchers themselves on how they use data, what their policies are, and their thoughts on how data re-use could be better suited to their use cases. Listening to researchers at different stages of their careers, we could see broad similarities but also meaningful variance: even within the Ecology and Environmental Biology fields there are very different needs and uses for similar research data.

We explored these dynamics through a series of presentations. Ashley Asmus, a graduate student involved in the DroughtNet and NutNet projects, explained the large amount of data they depend on across 27 countries, which could benefit from a more mature data management infrastructure. Dr. Lizzie Wolkovich introduced her lab’s new data management policy, requiring open sharing of data. And Dr. Karthik Ram explained his perspective on what the data world could learn from the software world in terms of making things as easy as possible, with a bottom-up approach.

Dr. Karthik Ram presenting on his experience working with open source software

Dryad and the disciplinary repository landscape

Before diving into Dryad-specific discussions, we took time to have a large-format discussion with guests from BCO-DMO, a repository for oceanographic data, as well as folks from the Arctic Data Center, both National Science Foundation-funded, discipline-specific repositories. It was evident that researchers do not feel they have proper guidance on which repository to use, even when funders feel this piece is clearly stated. Beyond it being a mandate, it’s important for researchers to submit to these repositories, as discipline-specific repositories typically provide richer curation than multi-disciplinary “general” repositories. A strong theme that emerged was how Dryad and others that are embedded in the article publishing processes could ensure submitted data are going to the right home.

Meeting user needs

Splitting the room based on user interests in submitting and publishing data or re-using data in Dryad, we turned the event space walls into post-it note exhibits. Researchers wrote down as many features and use cases as they could think of for either submitting data or using data. Within their groups they then clustered and prioritized these features. Interestingly, the majority of participants chose to focus on data re-use, reflecting the change in open data acceptance amongst the community they represent. Some of the highest priority features in this arena were about integrations and development of software tools that make the curated data more usable. For those focusing on submission, the top-rated features were around crediting back to funders and institutions, as well as relations to the scripts and code used to analyze the data.

Dr. Sally Otto representing the “Publishing Data” group discussion

Researchers clustering and prioritizing data re-use features

Maintaining a researcher-led community and platform

Circling back to the opening question, we prompted the group to think about their perceptions of what it means for researchers to be leading the Dryad community. Many of these perspectives centered around transparency in marketing, true costs, and the added values. A major point was how we can overcome barriers such as a lack of funding to publish data. Researchers raised the point that they may not be able to cover the cost of a data publishing charge, even at a respected US-based institution. Questions of how curation, integration, and open-source values can be inclusive of these communities struggling for funding prompted us to consider how disparate and diverse scientific research may be, even within the same domain. We received innovative ideas related to business models for supporting a broader audience of researchers as well as outreach ideas reflecting the need to integrate deeper within the open-source software community.

Working in conjunction with the open repositories (BCO-DMO, Arctic Data Center) and repository networks (DataONE) present at the workshop, and continuing to be led in the forms of governance and product management by researchers, Dryad and California Digital Library are striving to both understand and promote proper practices for community-ownership in open source data publishing. While this was a one-day event, we aim to continue to engage with broader research communities and encourage any researcher to get in touch with us if you have feedback or ideas for how you can get involved in our community.

CDL and Dryad thank the National Center for Ecological Analysis and Synthesis (NCEAS) for giving us the space to hold this meeting as well as the National Science Foundation for granting meeting funds.