Data Curation from Down Under: the 14th International Digital Curation Conference

It was a long journey from Chapel Hill, NC to Melbourne, Australia, but it was definitely worth it to attend the 14th International Digital Curation Conference (IDCC). The IDCC is always a great event for people involved in digital curation and preservation, especially when it is in a beautiful city like Melbourne. I was excited to attend this year and to take part in a 10-minute lightning talk on the Data Curation Network (DCN) entitled “The Data Curation Network: A Curator Perspective”. (More on this later in this post.) I’d like to take this opportunity to share some highlights from the conference.


The theme of this year’s IDCC, “Collaborations and Partnerships: addressing the big digital challenges together”, fits perfectly with what the Data Curation Network is all about. The Data Curation Network puts into place a cross-institutional staffing model connecting a network of expert data curators to increase local curation capacity, strengthen collaboration and support the sharing of research data. (To read more about the DCN and Dryad’s participation in the network, see Elizabeth Hull’s previous blog post announcing Dryad’s participation in the DCN launch.)

The main conference was kicked off with a “Welcome to Country Ceremony” conducted by a Wurundjeri Community Elder, along with a welcome to the University of Melbourne from Gwenda Thomas, Director, Scholarly Services and University Librarian. Kevin Ashley, Director, Digital Curation Centre, also gave a welcome to IDCC19 that included a challenge to conference participants: “listen, talk, interact and be inspired to do something”.

The opening keynote, which was presented by independent journalist Christine Kenneally and was entitled “Data, the creation of history and its impact on real lives”, related the compelling story of millions of orphans from around the world (including Australia and the US) searching for information about themselves. The orphans’ story highlighted the importance and direct impact of data on both a societal and an individual level, a theme that would emerge throughout the conference.

After the keynote, the various presentations in the form of parallel sessions, posters and lightning talks began. Throughout the conference, these presentations were organized into broad topics such as:

  • Grand curation challenges across disciplines
  • Metadata
  • Trust
  • Data quality
  • Digital humanities
  • Examples and models / Models and tools
  • Research disciplines & data services
  • Research data management / Research data services
  • Digital curation & preservation
  • Building diverse and inclusive communities
  • Curating indigenous data
  • Skills

As a representative of the DCN, I took part in a lightning talk session with a presentation put together by Erin Clary (Dryad Senior Curator), Lisa Johnston (Principal Investigator for the DCN and Director of the Data Repository for the University of Minnesota) and myself. The presentation focused on the experiences Erin and I have had so far as curators with the DCN pilot. After Lisa gave a brief overview of the DCN, I described the training and preparation all participating curators undertook and what it was like for Erin and me to actually begin curating DCN submissions.


John Chodacki (Director, University of California Curation Center) gave a great presentation about the “Community Led Open Data Infrastructure: CDL & Dryad Partnership” in which he shared how and why the partnership came about and what it means going forward. John followed up immediately with another presentation about “The Research Organization Registry”. As an added bonus after the conference, John led the workshop “Accelerating Data Publication: new models for research institutions”. (For a summary of the workshop, see the blog post from the perspective of workshop attendee Dr. Richard Ferrers.)

The thought-provoking final keynote was presented remotely (in light of the recent US Government shutdown) by Dr. Patricia Brennan, Director, US National Library of Medicine. Her presentation, “Jumping into the stream of data curation”, highlighted the enormous amount of data curated each day by the National Library of Medicine. Dr. Brennan spoke of an “information tsunami”, the challenges inherent in curating all that data and what those challenges may mean for the future of data curation. Her presentation also highlighted how the focus of data curation professionals has shifted over the years: from encouraging data curation to figuring out how to move forward, with limited resources, now that those efforts are paying off with a torrent of data.

The conference came to an end all too soon with closing remarks by Kevin Ashley and Donna McRostie and an IDCC 2019 theme song that put a smile on everyone’s face. Next year, curators will do it all again at the 15th International Digital Curation Conference in (drum roll, please) … Dublin, Ireland!


Some dos and don’ts for CC0

In 2011 Peggy Schaeffer penned an entry for this blog titled “Why does Dryad use CC0?” While 2011 seems like a long time ago, especially in our rapidly evolving digital world, the information in that piece is still as valid and relevant now as it was then. In fact, Dryad curators routinely direct authors to that blog entry to help them understand and resolve licensing issues. Since dealing with licensing matters can be confusing, it seems about time to revisit this briefly from a practical perspective.

Dryad uses Creative Commons Zero (CC0) to promote the reuse of data underlying scholarly literature. CC0 provides consistent, clear, and open terms of reuse for all data in our repository by allowing researchers, authors, and others to waive all copyright and related rights for a work and place the work in the public domain. Users know they can reuse any data available in Dryad with minimal impediments; authors gain the potential for more citations without having to spend time responding to requests from those wishing to use their data. In other words, CC0 helps eliminate the headaches associated with copyright and licensing issues for all stakeholders, leading to more data reuse.

So what does this mean in practical terms? Dryad’s curators have come up with a few suggestions to keep in mind as you prepare your data for submission. These tips can help you manage the CC0 requirements and avoid any problems:

DO:

  • Make sure any software included with your submission can be released under CC0. For example, licenses such as GPL or MIT are common and are not compatible with CC0. Be sure there are no licensing statements displayed in the software itself or in associated readme files.
  • Be aware that there are software applications out there that automatically place any output produced by the software under a non-CC0 compatible license. Consider this when you are deciding which software to use to prepare your data.
  • Know the terms of use for any information you get from a website or database.
  • Ensure that any images, videos, or other media that are not your own work can be released under CC0.
  • Be sure to clean up your data before submitting it, especially if you are compressing it using a tool such as zip or tar. Remove anything that can’t be released under CC0, along with any other extraneous materials, such as user manuals for hardware or software tools. Not only does removing extraneous files lessen the chance something will conflict with Dryad’s CC0 policy, it also makes your data more streamlined and easier to use.

DON’T:

  • Don’t add text anywhere in your data submission requiring permission or attribution for reuse. Community norms do a great job of putting in place the expectation that anyone reusing your data will provide the proper citations. CC0 actually encourages citation by keeping the process as simple as possible.
  • Don’t include your entire manuscript or parts of your manuscript in your data package. Most publications have licensing that restricts reuse and is not compatible with CC0.
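If your submission contains many files, a quick automated scan can help you spot leftover licensing or permission statements before you upload. The following is only a minimal sketch in Python: the function name `find_license_mentions` and the list of search phrases are my own illustrative assumptions, not an official Dryad tool or checklist, and a match simply means a file deserves a closer look by a human.

```python
import re
from pathlib import Path

# Illustrative phrases that often signal a license or reuse restriction.
# This list is an assumption for the example, not an official Dryad checklist.
LICENSE_PATTERNS = [
    r"GNU General Public License",
    r"\bGPL\b",
    r"\bMIT License\b",
    r"all rights reserved",
    r"\bcopyright\b",
    r"permission is (hereby )?(required|granted)",
]

PATTERN = re.compile("|".join(LICENSE_PATTERNS), re.IGNORECASE)


def find_license_mentions(root):
    """Return (relative path, line number, line text) for each suspicious line."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            # Skip binary or unreadable files; check those by hand.
            continue
        for number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                relative = path.relative_to(root).as_posix()
                hits.append((relative, number, line.strip()))
    return hits
```

Calling `find_license_mentions("my_submission")` on the folder you plan to compress (the folder name here is hypothetical) returns a list of files and lines to review. An empty list does not guarantee that everything can be released under CC0, so it is still worth reading through your readme files and the terms of use for any third-party material.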

I hope this post leaves you with a little more understanding about why Dryad uses CC0 and with a few tips that will help make following Dryad’s CC0 requirement easier.