PLoS: Open Data Means Better Science

Article excerpts.

The Open Knowledge Foundation: Open Data Means Better Science
Jennifer C. Molloy, Department of Zoology, University of Oxford,
Oxford, United Kingdom

http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001195

Molloy JC (2011) The Open Knowledge Foundation: Open Data Means Better
Science. PLoS Biol 9(12): e1001195. doi:10.1371/journal.pbio.1001195


The Open Knowledge Definition

The definition of "open", crystallised in the OKD, means the freedom
to use, reuse, and redistribute without restrictions beyond a
requirement for attribution and share-alike. Any further restrictions
make an item closed knowledge. It also emphasises the importance of
usability and access to the entire dataset or knowledge work:

"The work shall be available as a whole and at no more than a
reasonable reproduction cost, preferably downloading via the Internet
without charge. The work must also be available in a convenient and
modifiable form."

This is an important consideration for scientific data where in some
cases data is accessible, for example, in online supplements to
published papers, but is not licensed to be reusable; or it's
accessible and reusable but in a form that inhibits capture and
modification. Prior to online supplementary materials, requesting and
obtaining permissions and data was an extremely time-consuming
process, but even with instant downloads, deciding what rights one has
to reuse data can be confusing due to a lack of licensing and clear
terms of use. In some cases, the supplementary data associated with
papers is open even if the article itself is not; but this is often
not explicit. Clear labelling and licensing is vital to save
scientists the many hours they may spend discovering the openness or
otherwise of datasets and becomes even more imperative as computerised
analysis of the scientific literature increases, for example via data
and text mining.

....

Encouraging scientists to share their data is a challenge, even when
it directly supports published work. A 2008 report by the Research
Information Network [1] found that some researchers were unwilling to
share their data openly due to fears of exploitation, particularly for
datasets where they felt they could extract multiple publications;
another problem is the lack of career rewards, recognition, or
incentives to publish data, which makes it difficult for researchers
to justify the time and effort required to make data available.

However, there is top-down pressure to move towards open data
publication from funders such as the Wellcome Trust and the United
Kingdom Research Councils as well as the United States National
Institutes of Health (NIH), which published a joint statement to that
end in February 2011 [2]. The European Commission and the Royal
Society are both leading major enquiries into the future of the
communication of scientific information, with reports due later this
year. Open data in science has even appeared on government agendas; a
recent report from the UK House of Commons Select Committee on Science
and Technology examined research integrity and the peer review process
and concluded that:

"Access to data is fundamental if researchers are to reproduce,
verify and build on results that are reported in the literature … The
presumption must be that, unless there is a strong reason otherwise,
data should be fully disclosed and made publicly available. In line
with this principle, where possible, data associated with all publicly
funded research should be made widely and freely available … The work of
researchers who expend time and effort adding value to their data, to
make it usable by others, should be acknowledged as a valuable part of
their role" [3].

....

Is It Open Data?

Requesting data from other researchers can be a tortuous and sometimes
fruitless process. In a 2006 survey, 50.8% of US researchers reported
that data withholding had exerted a negative effect on the progress of
their research [7]. This problem could be overcome by sharing data
freely online, but as discussed previously, discovering the terms of
use of data can be a difficult and time-consuming task as this
information is often not explicitly stated at the point of data
viewing or download.

With this in mind, one of the first tools that the Open Data in
Science working group created was "Is It Open Data?" (IIOD?;
http://www.isitopendata.org/), a web application based on civil
society websites such as What Do They Know? (WDTK?; http://www.whatdotheyknow.com).
WDTK? allows users to make Freedom of Information requests for public
sector or government information in the UK and records the resulting
correspondence as a permanent and visible record in the public domain.
In much the same way, IIOD? enables interested parties to request the
open or closed status of data and data licensing details from
providers such as academic publishers, research organisations,
nongovernmental organisations, and all others making data available
online.

It has already been used to contact major scientific journal
publishers regarding the status of data in the supplementary
documentation associated with published papers, and we would encourage
others to contact their own journals of choice where data policies are
unclear. In our first round of enquiries, the openness of data in
Public Library of Science (PLoS) and BMC publications was confirmed,
while Nature Publishing Group also stated that raw data extracted from
their publications may be used as open data, with limited caveats.
Over time, extensive and systematic requests to journals and other
data providers are expected to build up a collection of position
statements on data reuse that are currently unavailable without
searching through the journal or publisher's websites individually.

....

References

1. Research Information Network (2008) To share or not to share:
research data outputs. Available:
http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-not-share-research-data-outputs.
Accessed 27 October 2011.
2. Walport M, Brest P (2011) Sharing research data to improve
public health. Lancet 377: 537–539. doi:10.1016/S0140-6736(10)62234-9.
3. House of Commons Science and Technology Committee (2011) Science
and Technology Committee – eighth report. Peer review in scientific
publications. Available: http://www.publications.parliament.uk/pa/cm201012/cmselect/cmsctech/856/85602.htm.
Accessed 27 October 2011.
4. Global Biodiversity Information Facility (2011) New incentive
for biodiversity data publishing [press release]. Available:
http://www.gbif.org/communications/news-and-events/showsingle/article/new-incentive-for-biodiversity-data-publishing.
Accessed 27 October 2011.
5. Nyman T, Vikberg V, Smith D. R, Boevé J (2010) How common is
ecological speciation in plant-feeding insects? A 'Higher' Nematinae
perspective. BMC Evolutionary Biology 10: 266. doi:
10.1186/1471-2148-10-266.
6. Nyman T (25 May 2011) On the unbearable lightness of mandatory
data sharing. BioMed Central Blog. Available:
http://blogs.openaccesscentral.com/blogs/bmcblog/entry/on_the_unbearable_lightness_of.
Accessed 27 October 2011.
7. Vogeli C, Yucel R, Bendavid E, Jones L, Anderson M, et al.
(2006) Data withholding and the next generation of scientists: results
of a national survey. Academic Medicine 81: 128–136. doi:
10.1097/00001888-200602000-00007.
8. Piwowar H. A, Day R. S, Fridsma D. B (2007) Sharing detailed
research data is associated with increased citation rate. PLoS ONE 2:
e308. doi:10.1371/journal.pone.0000308.
9. Piwowar H. A, Vision T. J, Whitlock M. C (2011) Data archiving
is a good investment. Nature 473: 285. doi:10.1038/473285a.
10. Samwald M, Jentzsch A, Bouton C, Stie Kallesøe C, Willighagen E,
et al. (2011) Linked open drug data for pharmaceutical research and
development. J Cheminform 3: 19. doi:10.1186/1758-2946-3-19.
