Thursday, November 11, 2004

Creative Commons: Proposal to Explore a Science Commons

So important I hope they don't mind me copying it here in full:

Establishing a Science Commons

The Problem: The sciences depend on access to and use of factual data. Powered by developments in electronic storage and computational capability, scientific inquiry is becoming more data-intensive in almost every discipline. Whether the field is meteorology, genomics, medicine, or high-energy physics, research depends on the availability of multiple databases, from multiple public and private sources, and their openness to easy recombination, search and processing. In the United States, this process has traditionally been supported by a series of policies, laws, and practices that were largely invisible even to those who worked in the sciences themselves.

First, American intellectual property law (and, until recently, the law of most developed countries) did not allow for intellectual property protection of "raw facts." One could patent the mousetrap, not the data on the behavior of mice, or the tensile strength of steel. A scientific article could be copyrighted. The data on which it rested could not be. Commercial proprietary ownership was to be limited to a stage close to the point where a finished product entered the marketplace. The data upstream remained for all the world to use.

Second, US law mandated that even those federal government works that could be copyrighted fell immediately into the public domain - a provision of great importance given massive governmental involvement in scientific research. More broadly, the practice in federally funded scientific research was to encourage the widespread dissemination of data at or below cost in the belief that, like the interstate system, this provision of a public good would yield incalculable economic benefits.

Third, in the sciences themselves, and particularly in the universities, a strong sociological tradition - sometimes called the Mertonian tradition of open science - discouraged the proprietary exploitation of data (as opposed to inventions derived from data) and required, as a condition of publication and replication, disclosure of the datasets on which the work was based.

Each of these three central tenets is now either under attack or subject to serious reservations. For example, in the genetic realm, patent law has moved perilously close to being an intellectual property right over raw facts - the C's, G's, A's and T's of a particular gene sequence. In other areas, complex contracts of adhesion create de facto intellectual property rights over databases, complete with "reach through agreements" and multiple limitations on use. More disturbingly, the US is considering and the EU has adopted a "database right" which actually does accord intellectual property protection to facts - upsetting one of the most fundamental premises of intellectual property: that one could never own facts, or ideas, only the inventions or expressions yielded by their intersection.

The Federal government's role is also changing. Under the pressure of the important (and in many ways admirable) Bayh-Dole statute, federally funded research in universities is now pushed towards early proprietary exploitation; universities become partners in privatizing and exploiting the fruits of research. While this is a good idea when it encourages the conversion of science into useful products brought to market, it is questionable when the proprietary pressures occur "upstream" at the most fundamental level of data and research. At the same time, universities depend more and more on their intellectual property portfolios, both for income and for positioning in relationships and negotiations with other institutions and for-profit entities.

Under these twin pressures, the third leg of the tripod is also beginning to crack. Scientists may be bound up in confidentiality agreements. Proprietary concerns limit or prohibit the transfer of the full datasets on which they work. Often unconsciously, institutions have begun to encourage secretive practices they formerly frowned on. Science policy, too, has begun to change as universities can no longer be depended on to play the role of public defender for the public domain that they traditionally played in the legislative realm. Around the world, government departments have begun to look at datasets as a source of revenue to be exploited, rather than a public good to be provided. The important National Academy study, Bits of Power, records the tragic consequences that this tendency had in access to satellite and weather data.

Importantly for Creative Commons, many of the tendencies here involve both a collective action problem and a race to the bottom. Universities as a whole might be better off if more data were freely available; for an individual university to pursue such a policy alone is hard, and sometimes foolish: one is reluctant to give away that for which everyone else charges a high price. The same tendency occurs in different ways outside the university setting. The US government frequently buys the same data many times from private parties (private satellite companies, for example). Individual departments do not necessarily have incentives to try and make a deal that will benefit the government or the public as a whole. The same is true when government agencies provide data to private companies who add value to it, and offer it back with better search functions or improved interfaces, but subject to major contractual and legal restraints beyond the particular agency involved. Ideally, there would be standard agreements under which such deals were cut which maximized general social value and research availability, rather than only reflecting the budgetary or research interests of one particular agency.

The Search for a Solution: These facts have not gone unnoticed. Numerous scientists have pointed out the tragic irony that, right at the historical moment when we have the technologies to permit worldwide availability and distributed processing of scientific data and their concomitant promise for broadening collaboration and accelerating the pace and depth of discovery, we are busy locking up that data and slapping legal restrictions on transfer. Learned societies including the National Academies of Sciences, federal granting agencies such as the National Science Foundation, and other groups have all expressed concern about the trends that are developing. Much attention has been focused on proposals for legislative change, which - while important - will be both extremely hard to push through and an incomplete solution. Any solution will need to be as complex as the problem it seeks to solve, which is to say it will be interdisciplinary, multinational, and involve both public and private initiatives. What's more, judicious balance is needed: the tendency to claim that property rights are never the answer, or that openness always solves all problems, must be avoided.

Enter "Science Commons": Creative Commons was formed to deal with a problem of access to materials caused by the conjunction of technological developments - computers' increasing capability to store and process data vastly enhanced in effect by interconnection via the World Wide Web - and legal change. With much at stake and so many stakeholders, the debate over control of creative work and information now tends to the extremes. At one pole is a vision of total control in which every last use of a work - or even data - is regulated and in which "all rights reserved" (and then some) is the norm. At the other end is a vision of anarchy, a world in which creators enjoy a wide range of freedom but are left vulnerable to exploitation.

In many arenas the default rule, or standard operating procedure, has become "lock it up". Balance, compromise and moderation, once the driving forces of an intellectual property system that balanced private reward with public gain and protection with innovation, have become endangered species. Creative Commons is working to revive these principles and practices. We use private rights to create public goods: creative works set free for certain uses.

Creative Commons now enables creators to select among various license options to make their work available to the public on generous terms, and then applies three layers of licenses (in legal, lay and machine-readable languages) and descriptive metadata to their work. Attendant to our development of licenses and Internet applications enabling creators to license their work to the public, we are also engaged in these projects (among others):

Tagging various kinds of file formats with Creative Commons metadata

Increasing the number of ways people can search for free or licensed work in an end-to-end system without relying on a centralized, authoritative database

Developing tools, services and educational projects to enable alternatives to maximum content control

Translating the Creative Commons system into numerous languages and national or regional legal systems (including special licensing provision for developing nations)

Connecting with the World Wide Web Consortium to promote "semantic web" tools to allow machines to communicate a richer set of information about files and pages on the Net.

Forging agreements with universities, tech companies, and others with competing interests to enhance the public's access to proprietary content. See http://creativecommons.org/learn/collaborators.

CC's charge initially was entirely in the cultural and copyright realms - in the world of music, texts, blogs, pictures, films, and so on. Nevertheless, at the first board meeting, the founding board members expressed strong interest in the possibilities of developing the creative commons model in the scientific area, should it appear that the technologies and expertise we were developing might usefully apply there. Several times, in fact, board members expressed the feeling that the Creative Commons approach might be more of a "killer app" in science than in culture. Recognizing that developing open pathways for scientific research will be complex and contentious, the Creative Commons board did not feel that at that point we had the expertise or the technical capability to enter this field. We now believe that we do.

What could Creative Commons bring to the world of science? In a single sentence, the answer is this: Creative Commons is a disinterested party with remarkable experience in the formation and deployment of well-written, accessible, machine- and human-readable licenses that guarantee wider availability of material while preserving some selected intellectual property rights. Along with scientists, patent and university IP lawyers and scholars, we believe that this particular conjunction of features might encourage an enormously valuable thaw against the freeze-in of scientific data. We anticipate that there will be a major role for well-written, standard form, machine- and human-readable agreements:

Between funders and grant recipients, requiring greater access to data.

Between universities and researchers, prohibiting collectively the most toxic types of restrictions on data, and guaranteeing a level and open playing field of access to data.

Between government agencies that are purchasing data from, or providing data to, private commercial concerns, so as to develop standard terms that benefit the public and research as a whole.

In any or all of these areas, a Science Commons division of Creative Commons could play an important role.

Advantages:

i.) Disinterest: Unlike universities, scientists, learned societies, publishers, or the National Science Foundation, Creative Commons is neither a provider nor a recipient of research science dollars. We do not produce, consume, sell or distribute scientific data, nor do we rely on its openness or its restriction for our existence. To differing extents, every other group at the table has a particular set of interests in the outcome of debates on this issue - some of which might be more or less congruent with the public interest. Our position of relative disinterest might help to facilitate a role in the discussions - whether as "honest broker," technical and legal advisor, or policy entrepreneur, bringing new solutions to old problems.

ii.) Experience in Licensing Solutions: Many of the things that we have learned in forming the Creative Commons do not translate completely to the world of science policy. We dealt primarily with copyright - here the issues would also involve patent and trade secret. We were dealing with a very large number of individuals with little legal expertise who were not repeat players in the system. Here we would be dealing both with individuals who fit that model to some extent (scientists, low-level administrators in government departments) and with well-funded, sophisticated repeat players (universities, national funding bodies). Nevertheless, a number of the types of expertise that Creative Commons has developed apply valuably here.

Machine Readability: It might be advantageous for some datasets to travel with their electronically expressed licenses. The ability to combine those datasets without worrying about obscure contracts hidden in some general counsel's office a continent away would be a major benefit.

Human Readability: We have considerable expertise in writing licenses that non-lawyers (in this case, scientists, researchers and administrators) can understand. If material is to be open in practice, rather than in theory, this will be a vital point.

The drafting process: The process of drafting the licenses in such fields will be extremely complex, involving negotiating skills, and the need for widely respected, disinterested and legally expert participants. It will also require considerable involvement from different communities, and - above all - the need to secure "buy in" from the various groups involved, all of whom have incentives to want it to succeed, but have differing interests that have to be explained, translated and negotiated. Creative Commons has a fair amount of experience in these tasks, experience that would appear to have a role to play in the world of science, supported by active pro bono representation by several top law firms.

Internet Architecture and Software Development: Our Technical Advisory Board, headed by one of our directors, Hal Abelson (Professor of Electrical Engineering and Computer Science at MIT), couples extraordinary sophistication in the architecture and functionality of the Internet with great skill in writing software for our applications.
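To make the Machine Readability point above concrete, here is a minimal Python sketch of datasets that carry their license terms with them, so that software can check whether two datasets may be combined for a given use. Everything in it is an assumption made for illustration: the `License` and `Dataset` classes, the `combine` function, the term vocabulary ("copy", "derive", "commercial", "attribution") and the sample data are hypothetical and do not come from the proposal or from any actual Creative Commons schema.

```python
from dataclasses import dataclass


# Hypothetical license model, for illustration only.
@dataclass(frozen=True)
class License:
    name: str
    permits: frozenset   # uses the license allows, e.g. {"copy", "derive"}
    requires: frozenset  # obligations it imposes, e.g. {"attribution"}


# A dataset that travels together with its license terms.
@dataclass(frozen=True)
class Dataset:
    title: str
    license: License
    records: tuple


def combine(a, b, intended_use):
    """Merge two datasets only if both licenses permit the intended use."""
    for ds in (a, b):
        if intended_use not in ds.license.permits:
            raise PermissionError(
                f"{ds.title!r} ({ds.license.name}) does not permit {intended_use!r}"
            )
    # The merged data may be used only in ways both licenses allow,
    # and it inherits the obligations of both.
    merged = License(
        name=f"{a.license.name} + {b.license.name}",
        permits=a.license.permits & b.license.permits,
        requires=a.license.requires | b.license.requires,
    )
    return Dataset(f"{a.title} + {b.title}", merged, a.records + b.records)


# Made-up example terms and data.
open_terms = License(
    "open-example", frozenset({"copy", "derive", "commercial"}), frozenset()
)
attrib_terms = License(
    "attribution-noncommercial-example",
    frozenset({"copy", "derive"}),
    frozenset({"attribution"}),
)
rainfall = Dataset("rainfall", open_terms, ((2003, 812),))
crop_yield = Dataset("crop yield", attrib_terms, ((2003, 4.1), (2004, 4.4)))

combined = combine(rainfall, crop_yield, "derive")
```

In this sketch the combined dataset permits only what both source licenses permit and carries the obligations of both; asking for a use that one license withholds (here, "commercial") raises an error instead of silently merging. A real scheme would need far richer terms, but the check itself needs no trip to a general counsel's office.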

iii.) Recognition and good will: Launched in December 2002, Creative Commons quickly established itself as an important, innovative player. Numerous prominent institutions and organizations have adopted Creative Commons licenses to make their content available online to the public. Examples include:

MIT's Open Courseware project,
http://ocw.mit.edu/index.html

Berklee School of Music, Berklee Shares online education project
http://www.berkleeshares.com

Rice University's Connexions, interactive courseware and repository,
http://cnx.rice.edu/

The Public Library of Science, a world-class open-access journal,
http://www.publiclibraryofscience.org/

OYEZ, audio archives of U.S. Supreme Court arguments since the 1950s,
http://oyez.org

Opsound, an archive of hundreds of openly licensed sounds and songs,
http://www.opsound.org

The Internet Archive, a nonprofit offering free hosting of text, audio, video, and web materials,
http://www.archive.org

eMultimedia Training Kit, training materials sponsored and used by UNESCO,
http://www.itrainonline.org/itrainonline/mmtk/index.shtml

A search on All-the-Web (a popular search engine) shows more than 1,000,000 licenses back-linked to Creative Commons. Searches on Google for the phrase "This work is licensed under a Creative Commons license" (the words attached to every licensed work) increased 316% between February and July 2003. In addition to substantial grants in place from the MacArthur Foundation and the Foundation for the Public Domain, in July 2003 Creative Commons was awarded a $1 million grant from the Hewlett Foundation.

Suggested Reading:

Bits of Power: Issues in Global Access to Scientific Data, National Academies Press (1997), http://www.nap.edu/readingroom/books/BitsOfPower.

Jerome Reichman & Paul Uhlir, A Contractually Reconstructed Research Commons for Scientific Data in a Highly Protectionist Intellectual Property Environment, 66 Law & Contemp. Probs. 315 (Winter/Spring 2003), http://www.law.duke.edu/shell/cite.pl?66+Law+&+Contemp.+Probs.+315+(WinterSpring+2003).

Essays on the Public Domain, (special editor James Boyle) Law & Contemporary Problems 2003, http://www.law.duke.edu/journals/lcp/indexpd.htm.
