[DIYbio] Fwd: [open-science] [SCHOLCOMM] Libre open access, copyright, patent law, and, other intellectual property matters


From: john wilbanks <jtw@del-fi.org>

Date: Thu, Mar 22, 2012 at 10:35 AM
Subject: Re: [open-science] [SCHOLCOMM] Libre open access, copyright, patent law, and, other intellectual property matters
To: Peter Murray-Rust <pm286@cam.ac.uk>
Cc: Heather Morrison <hgmorris@sfu.ca>, open-science <open-science@lists.okfn.org>


I realize that I didn't make my point clear enough actually.

And I don't lump Heather in with Harnad. Heather asked a good question that I answered obliquely. For that I apologize.

I do not just want the ability for academics to text mine. I want there to be a robust market for text mining that includes companies who mine open access content for their own reasons as well as academics, and I want there to be a robust market of startups who provide those text mining services (and thus must make and distribute copies of corpuses as validation sets, as part of collaborations with academics that improve algorithms, and who also produce and sell the outputs of text mining). Right now text mining pretty much sucks, frankly, compared to what it ought to be.

Non commercial licenses are not just a way to prevent other publishers from reselling content, which is often the focus of the conversation, but a tax on startups and companies who want to treat the literature as data. Here's a short list of companies trying to do just that who are being hamstrung by closed access, and who would be blocked under NC terms: Personalized Medicine (providing auto-annotation of genotypes to doctors' offices), Selventa (providing auto-created hypotheses explaining high throughput experimental biological data), Ingenuity (providing large databases of assertions specific to diseases or tissues). Those three are simply the first ones that jump to mind in startup land. There are ~20 more I know of, and many more that I don't.

The uncertainty around content chills venture investment, to boot. If the web had been NC licensed, we would not have Google, and PageRank would have remained where it started as an academic theory experiment. That would suck, in my opinion. And the big publishers know this, which is precisely why they add clauses that ban mining to existing licenses and want commercial restrictions. I don't think they're worried about resale. I think they're worried about getting their lunch eaten by new entrants who see the market differently, as Apple did with music, as Google did to Microsoft (and in turn Facebook did to Google). That's why Elsevier has an entire unit devoted to this stuff, run by extremely smart people.

Then there's all of big pharma and biotech, who all maintain libraries and subscriptions, but are often absent from these discussions because of their position on patents.

Non commercial restrictions have *side effects* that are bad for innovation and bad for science. We need entrepreneurs and not just academics.

This is not nearly as much of a problem in the humanities at first blush, but the reality is that as text mining gets better, faster, cheaper, and more subtle in the hard sciences, it will bring amazing tools to the humanities as well.

jtw


On 3/22/2012 1:06 AM, Peter Murray-Rust wrote:


On Wed, Mar 21, 2012 at 11:58 PM, john wilbanks <jtw@del-fi.org> wrote:

   I'm going by the BBB declarations.


Thanks John, [and Klaus] and so am I.

I'm happy to see robust discussion on this list - we should avoid flame
wars.

It's somewhat unfortunate that there seems an operational division
between science and humanities. It would be nice to have a one-size-fits-all
for "Open Access" but the reality may evolve to be different. The
Harnad-Morrison-Thatcher approach could be summed up as:
* the primary goal is that humans can somehow find a Gratis copy of the
work to read with their eyes. It is of secondary importance whether the
community has any rights.

The science community on the other hand wishes to make complete use of
the complete scholarly literature using modern technology to discover,
index, extract, re-use, recompute, re-assemble in whatever way their
imagination and technology runs to. (I wish to build an artificially
intelligent chemical amanuensis by semantic analysis of the complete
literature, for example).
* ANY licence other than BBB-compliant prevents this ABSOLUTELY. Any
publisher's contract prevents this absolutely.

It is profoundly unhelpful to this cause to have people pontificating
about absolute author's rights and quasi-religious approaches to solving
the problem. Harnad and Morrison know nothing about high-throughput
textmining, data extraction, eigenvector-based indexing, etc. If they
wish to publish their own work under NC I shan't fight it.

UK/PubMedcentral is crippled by the lack of explicit full-libre
permission to re-use it. 20 million scientific articles of which about
1% are legally minable and those are extremely difficult to discover. I
spent my "research" effort trying to find these, rather than actually
DOING the science from them. Last week my tools read 500,000 chemical
reactions from the patent literature, better as well as infinitely
faster than any human on the planet. Those reactions can help to find
new drugs, new ways of making drugs, new insights into chemistry.

The reality is that science can operate extremely well with CC-BY. I am
yet again preparing a clutch of articles for Biomed Central (a special
issue with 17 APC-based articles). BMC have been running for 10 years.
As far as I know there has been no serious misuse of the literature so
there is no need to "protect" CC-BY.

On a related point, institutional repositories are almost completely
useless for modern literature analysis. They do not carry explicit
machine-readable libre licences so we cannot by right use any of their
content. They are fragmented - instead of the UK having ONE repository
(say in the BL), which would be the rational thing that any scientist
would do, they are spread over 200 universities at great additional cost.

All that leads up to me thanking the RCUK for insisting on CC-BY and -
with other scientific organizations such as Wellcome, and the Libre
science publishers - making BBB-OpenAccess a reality. There is a great
deal more to do, but at least we have a model that works and that
politicians are listening to.


--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069

--
------
john wilbanks
@wilbanks
http://del-fi.org

_______________________________________________
open-science mailing list
open-science@lists.okfn.org
http://lists.okfn.org/mailman/listinfo/open-science



--
- Bryan
http://heybryan.org/
1 512 203 0507

--
You received this message because you are subscribed to the Google Groups "DIYbio" group.
To post to this group, send email to diybio@googlegroups.com.
To unsubscribe from this group, send email to diybio+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/diybio?hl=en.
