Monday, May 15, 2006

Scan this book

Kevin Kelly in the NYT on the Google book project.

"There are dozens of excellent reasons that books should quickly be made part of the emerging Web. But so far they have not been, at least not in great numbers. And there is only one reason: the hegemony of the copy...

In preindustrial times, exact copies of a work were rare for a simple reason: it was much easier to make your own version of a creation than to duplicate someone else's exactly. The amount of energy and attention needed to copy a scroll exactly, word for word, or to replicate a painting stroke by stroke exceeded the cost of paraphrasing it in your own style. So most works were altered, and often improved, by the borrower before they were passed on. Fairy tales evolved mythic depth as many different authors worked on them and as they migrated from spoken tales to other media (theater, music, painting). This system worked well for audiences and performers, but the only way for most creators to earn a living from their works was through the support of patrons.

That ancient economics of creation was overturned at the dawn of the industrial age by the technologies of mass production. Suddenly, the cost of duplication was lower than the cost of appropriation. With the advent of the printing press, it was now cheaper to print thousands of exact copies of a manuscript than to alter one by hand. Copy makers could profit more than creators. This imbalance led to the technology of copyright, which established a new order. Copyright bestowed upon the creator of a work a temporary monopoly — for 14 years, in the United States — over any copies of the work. The idea was to encourage authors and artists to create yet more works that could be cheaply copied and thus fill the culture with public works.

Not coincidentally, public libraries first began to flourish with the advent of cheap copies. Before the industrial age, libraries were primarily the property of the wealthy elite. With mass production, every small town could afford to put duplicates of the greatest works of humanity on wooden shelves in the village square. Mass access to public-library books inspired scholarship, reviewing and education, activities exempted in part from the monopoly of copyright in the United States because they moved creative works toward the public commons sooner, weaving them into the fabric of common culture while still remaining under the author's copyright. These are now known as "fair uses."

This wonderful balance was undone by good intentions. The first was a new copyright law passed by Congress in 1976. According to the new law, creators no longer had to register or renew copyright; the simple act of creating something bestowed it with instant and automatic rights. By default, each new work was born under private ownership rather than in the public commons. At first, this reversal seemed to serve the culture of creation well. All works that could be copied gained instant and deep ownership, and artists and authors were happy. But the 1976 law, and various revisions and extensions that followed it, made it extremely difficult to move a work into the public commons, where human creations naturally belong and were originally intended to reside. As more intellectual property became owned by corporations rather than by individuals, those corporations successfully lobbied Congress to keep extending the once-brief protection enabled by copyright in order to prevent works from returning to the public domain. With constant nudging, Congress moved the expiration date from 14 years to 28 to 42 and then to 56.

While corporations and legislators were moving the goal posts back, technology was accelerating forward. In Internet time, even 14 years is a long time for a monopoly; a monopoly that lasts a human lifetime is essentially an eternity. So when Congress voted in 1998 to extend copyright an additional 70 years beyond the life span of a creator — to a point where it could not possibly serve its original purpose as an incentive to keep that creator working — it was obvious to all that copyright now existed primarily to protect a threatened business model. And because Congress at the same time tacked a 20-year extension onto all existing copyrights, nothing — no published creative works of any type — will fall out of protection and return to the public domain until 2019. Almost everything created today will not return to the commons until the next century. Thus the stream of shared material that anyone can improve (think "A Thousand and One Nights" or "Amazing Grace" or "Beauty and the Beast") will largely dry up.

In the world of books, the indefinite extension of copyright has had a perverse effect. It has created a vast collection of works that have been abandoned by publishers, a continent of books left permanently in the dark. In most cases, the original publisher simply doesn't find it profitable to keep these books in print. In other cases, the publishing company doesn't know whether it even owns the work, since author contracts in the past were not as explicit as they are now. The size of this abandoned library is shocking: about 75 percent of all books in the world's libraries are orphaned. Only about 15 percent of all books are in the public domain. A luckier 10 percent are still in print. The rest, the bulk of our universal library, is dark...

Having searchable works is good for culture. It is so good, in fact, that we can now state a new covenant: Copyrights must be counterbalanced by copyduties. In exchange for public protection of a work's copies (what we call copyright), a creator has an obligation to allow that work to be searched. No search, no copyright. As a song, movie, novel or poem is searched, the potential connections it radiates seep into society in a much deeper way than the simple publication of a duplicated copy ever could.

We see this effect most clearly in science. Science is on a long-term campaign to bring all knowledge in the world into one vast, interconnected, footnoted, peer-reviewed web of facts. Independent facts, even those that make sense in their own world, are of little value to science. (The pseudo- and parasciences are nothing less, in fact, than small pools of knowledge that are not connected to the large network of science.) In this way, every new observation or bit of data brought into the web of science enhances the value of all other data points. In science, there is a natural duty to make what is known searchable. No one argues that scientists should be paid when someone finds or duplicates their results. Instead, we have devised other ways to compensate them for their vital work. They are rewarded for the degree that their work is cited, shared, linked and connected in their publications, which they do not own. They are financed with extremely short-term (20-year) patent monopolies for their ideas, short enough to truly inspire them to invent more, sooner. To a large degree, they make their living by giving away copies of their intellectual property in one fashion or another...

The reign of the copy is no match for the bias of technology. All new works will be born digital, and they will flow into the universal library as you might add more words to a long story. The great continent of orphan works, the 25 million older books born analog and caught between the law and users, will be scanned. Whether this vast mountain of dark books is scanned by Google, the Library of Congress, the Chinese or by readers themselves, it will be scanned well before its legal status is resolved simply because technology makes it so easy to do and so valuable when done. In the clash between the conventions of the book and the protocols of the screen, the screen will prevail. On this screen, now visible to one billion people on earth, the technology of search will transform isolated books into the universal library of all human knowledge."

It's a long piece for a newspaper but worth the effort.
