
We present a guest post from researcher Falk Lüsebrink highlighting the benefits of data sharing. Falk is currently working on his PhD in the Department of Biomedical Magnetic Resonance at the Otto-von-Guericke University in Magdeburg, Germany. Here, he talks about his experience of sharing early MRI data and the unexpected impact that it is having on the research community.

Early release of data

The first time I faced a decision about publishing my own data was while writing a grant proposal. One of our proposed objectives was to acquire ultrahigh resolution brain images in vivo, making use of an innovative development: a combination of an MR scanner with ultrahigh field strength and a motion correction setup to remediate subject motion during data acquisition. While waiting for the funding decision, I simply could not resist acquiring a first dataset. We scanned a highly experienced subject for several hours, allowing us to acquire in vivo images of the brain with a resolution far beyond anything achieved thus far.


MRI data showing the cerebellum in vivo at (a) neuroscientific standard resolution of 1 mm, (b) our highest achieved resolution of 250 µm, and (c) state-of-the-art 500 µm resolution.
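To put the three resolutions in the caption side by side, here is a small back-of-the-envelope calculation. It assumes isotropic (cubic) voxels, so the edge length alone fixes the voxel volume; the function names are purely illustrative and are not part of the dataset or any MRI software.

```python
def voxel_volume_mm3(edge_um: float) -> float:
    """Volume in mm^3 of a cubic voxel whose edge length is given in micrometres."""
    edge_mm = edge_um / 1000.0
    return edge_mm ** 3


def voxels_per_reference(edge_um: float, reference_um: float = 1000.0) -> float:
    """How many voxels of the given edge length fit into one reference voxel (default 1 mm)."""
    return (reference_um / edge_um) ** 3


if __name__ == "__main__":
    # Edge lengths taken from the figure caption; isotropy is an assumption.
    for label, edge_um in [("1 mm", 1000.0), ("500 µm", 500.0), ("250 µm", 250.0)]:
        print(
            f"{label}: {voxel_volume_mm3(edge_um):.6f} mm^3 per voxel, "
            f"{voxels_per_reference(edge_um):.0f} voxel(s) per 1 mm voxel"
        )
```

Under that assumption, halving the edge length divides the voxel volume by eight, so a 250 µm voxel covers 1/64 of the volume of a 1 mm voxel.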

When our colleagues saw the initial results, they encouraged us to share the data as soon as possible. Through Scientific Data and Dryad, we were able to do just that. The combination of a peer-reviewed open access journal and an open access digital repository for the data was perfect for presenting our initial results.

17,000 downloads and more

‘Sharing the wealth’ seems to have been the right decision: in the three months since we published our data, there has been an enormous amount of activity.

A distinct need for data re-use

MRI studies are highly interdisciplinary, opening up numerous opportunities for sharing and re-using data. For example, our data might be used to build MR brain atlases and to illustrate brain structures in much greater detail, or even for the first time. This could advance our understanding of brain functions. The algorithms that quantify brain structures, which are needed in research on neurodegenerative disorders, could be made more accurate and more reproducible. Furthermore, because we are making available the raw signals measured by the MR scanner, image reconstruction methods could be used to refine image quality or reduce the time it takes to collect the data.

There are also opportunities beyond those that our particular dataset offers. A recent trend in MRI comes from the field of machine learning. Neural networks are being built to perform, and potentially improve, all kinds of tasks, from image reconstruction to image processing and even diagnostics. Training such networks requires huge amounts of data, and these data could come from repositories open to the public. Such re-use of MRI data by researchers in other disciplines is having a strong impact on the advancement of science. By publicly sharing our data, we are allowing others to pursue new and exciting directions.

Download the data for yourself and see what you can do with it. In the meantime, I am still eagerly awaiting the acceptance of the grant application . . . but that’s a different story.

The data: http://dx.doi.org/10.5061/dryad.38s74

The article: http://dx.doi.org/10.1038/sdata.2017.32

— Falk Lüsebrink


Our latest featured data package is from Alexandra Swanson and colleagues at the Snapshot Serengeti project, and accompanies their peer-reviewed article in Scientific Data.  It provides a unique resource for studying one of the world’s most extraordinary mammal assemblages and also for studies of computer vision and machine learning. In addition, data from Snapshot Serengeti is already being used in biology and computer science classrooms to enable students to work on solving real problems with authentic research data.

Lion (image from Snapshot Serengeti, CC BY-NC-SA 3.0)

The raw data (which are being made available from the University of Minnesota Supercomputing Institute) consist of 1.2 million sets of images collected between February 2011 and May 2013 from 225 heat- and motion-triggered cameras, operating day and night, distributed over 1,135 sq. km. in Serengeti National Park in Tanzania. This staggering trove of images was classified by 28,040 registered and ~40,000 unregistered volunteers on Snapshot Serengeti (a Zooniverse project) according to the species present (if any), the number of individuals, the presence of young, and the behaviors being displayed, such as standing, resting, moving, eating, or interacting.

Remarkably, this vast army of citizen scientists was classifying the images faster than they were being produced, and each image set was classified on average by nine different volunteers. This led to consensus classifications with high accuracy: 96.6% for species identifications relative to an expert-classified gold set. The more than 300,000 image sets that contain animals show 48 different species, including rare mammals such as the aardwolf and the zorilla.

Zorilla (image from Snapshot Serengeti, CC BY-NC-SA 3.0)

The Dryad data package includes the classifications from all the individual volunteers, the consensus classifications, information about when each camera was operational, and the expert classification of 4,149 image sets as a gold standard.
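The idea of combining several volunteer classifications into one consensus label, and then scoring that consensus against an expert gold set, can be sketched in a few lines of Python. This is a minimal illustration only: it uses a simple plurality vote on a single species label per image set, which is an assumption and not necessarily the aggregation method the Snapshot Serengeti team used. The record layout, function names, and toy data below are all hypothetical and do not reflect the actual file format of the Dryad package.

```python
from collections import Counter, defaultdict


def consensus_by_plurality(classifications):
    """Return {image_set_id: species} using the most frequent volunteer label.

    `classifications` is an iterable of (image_set_id, volunteer_id, species)
    tuples. This layout is hypothetical, chosen only for the sketch.
    """
    votes = defaultdict(Counter)
    for image_set_id, _volunteer_id, species in classifications:
        votes[image_set_id][species] += 1
    # most_common(1) picks the label with the highest count; if labels tie,
    # the one encountered first wins.
    return {
        image_set_id: counter.most_common(1)[0][0]
        for image_set_id, counter in votes.items()
    }


def accuracy_against_gold(consensus, gold):
    """Fraction of gold-standard image sets whose consensus label matches."""
    shared = [image_set_id for image_set_id in gold if image_set_id in consensus]
    if not shared:
        raise ValueError("no overlap between consensus and gold standard")
    correct = sum(consensus[i] == gold[i] for i in shared)
    return correct / len(shared)


# Toy data, invented for illustration (not taken from the dataset).
records = [
    ("S1", "v1", "lion"), ("S1", "v2", "lion"), ("S1", "v3", "zebra"),
    ("S2", "v1", "zorilla"), ("S2", "v2", "aardwolf"), ("S2", "v3", "zorilla"),
    ("S3", "v1", "nothing"), ("S3", "v2", "nothing"),
]
gold = {"S1": "lion", "S2": "aardwolf"}

consensus = consensus_by_plurality(records)
print(consensus)
print(accuracy_against_gold(consensus, gold))
```

In this toy example the volunteers agree with the expert on S1 but outvote the correct label on S2, so the consensus matches one of the two gold-standard image sets. With the real data, the same comparison would be made against the 4,149 expert-classified image sets.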

References:

  • Swanson et al. (2015) Snapshot Serengeti, high frequency annotated camera trap images of 40 mammalian species in an African savannah. Scientific Data.  http://dx.doi.org/10.1038/sdata.2015.26
  • Swanson et al. (2015) Data from: Snapshot Serengeti, high frequency annotated camera trap images of 40 mammalian species in an African savannah. Dryad Digital Repository http://doi.org/10.5061/dryad.5pt92
