Throwing the book at Google

ft.com

A 1960 sociological study of female Finnish students and an 1894 handbook on how to play cricket are probably at the top of no one's poolside reading list this year.

Long out of print, such works are more likely to be gathering dust in attics, languishing forgotten at the backs of people's bookshelves or, as in the case of these two volumes, mouldering in the Harvard and Wisconsin university libraries respectively. Of the estimated 40 million different books held by US libraries, well over half are unlikely ever to find their way back into a publisher's favour.

That makes Google's effort to burrow deep into the leading US research libraries and make digital copies of all the works it can lay its hands on seem both ambitious and quixotic. The project, begun nearly five years ago, has also started scanning out-of-copyright works from libraries in other countries. A digital archive of all extant books – even ones in which few people are these days likely to show much interest – carries the internet company's mission to "organise the world's information" to the extreme.

Yet this mountain of fading literary oddments is now at the centre of a fierce debate in the book world that is about to come to a head.

After facing copyright lawsuits in the US over the digitisation project, Google reached a settlement last year that seemed to have something for just about everyone: publishers and authors, because it gives them a chance to make money from long-forgotten works; public and university libraries, as it provides them with a way to leap beyond their dead-tree stacks into the digital age; and readers, to whom it brings access to millions of works that would otherwise have remained out of reach.

But this agreement with the US book industry, which awaits court approval, has stirred up the sort of passions that always attach to books, those most cultural of manufactured objects. In particular, the deal has provoked the fear that a more centralised industry will arise as publishing turns digital, up-ending checks and balances put in place over decades.

"The book world has done really well out of decentralisation – anyone who has ideas, or access to a printing press, can take part," says James Grimmelmann, associate professor at New York Law School, a leading critic of the settlement. Giving Google too much power over old, out-of-print works, he adds, could set the stage for its dominance of the broader digital book market: "Control over the past will translate into control over the future of books."

The US Department of Justice has taken such concerns seriously enough to launch an investigation into the competitive implications of the settlement: it is due to submit its views to the court considering the case in the middle of next month. Before that, the European Commission has called its own hearing on the issue, to consider the impact on Europe's book industry and authors' rights.

The main focus of the settlement falls on out-of-print books that are still in copyright. These works probably account for 60 per cent or so of all books in the US, making them a massive – if heavily underused – intellectual resource. While Google's initial go-it-alone approach to digitising these works brought angst and lawsuits, the accord has turned it into an ally of the American book world. Unless copyright owners opt out of the plan, a Book Rights Registry to be run by representatives of the publishers and authors will have the power to license digital rights for all out-of-print books in the US to Google.

Google will then make parts of these works available through its search service, sell subscriptions to the entire database to university libraries and others – every library in the US will be offered a single free terminal to tap into the treasure trove – and sell access to full versions of individual works hosted on its computers. It will keep 37 per cent of the money from these sales, passing the rest to the registry to be paid out to copyright holders.

The undertaking is set to cost "hundreds of millions of dollars", says Dan Clancy, head of the Google Books effort. Yet there is little business in old books: second-hand volumes are estimated to account for less than $1 billion of the $25 billion US books market. The scale of the ambition makes it the sort of thing that only a Google would contemplate – or be able to afford.

David Balto, a former justice department lawyer, argues that any antitrust concerns are dwarfed by the benefits the settlement will bring. "What Google is doing is incredible – from a competition policy perspective, you don't want to punish people who are risking millions of dollars doing things like this that haven't been done before," he says.

Even the settlement's critics admit that it will bring immediate and substantial benefits, making millions of books widely available in the US for the first time. Yet its potential long-term impact on the shape of the digital book market has guaranteed that the settlement will attract close regulatory scrutiny, whatever its immediate attractions.

Critics fear that two aspects in particular could hand Google too much power, while also leaving a coterie of publishers and authors with disproportionate sway over setting prices for digital works, to the detriment of readers.

The first concerns the exclusive right that Google would have to distribute digital books whose copyright holders cannot be traced. These so-called "orphan works" may make up a large portion of all out-of-print tomes: Paul Courant, head of the University of Michigan library, estimates that they amount to 1-2.5 million of his collection of 8 million volumes.

Congress has failed in its own efforts to free up these works so they can be sold without the risk of claims later from the copyright owners. It is a peculiarity of class action law in the US, though, that private legal action can achieve a result that has eluded Congress: since Google and the new books registry would be free to sell works whose owners did not actively opt out of the court-approved settlement, they would assume a right not available to anyone else.

But even if Google is left as sole distributor of orphan works, do the benefits outweigh antitrust worries? "Google is certainly going to be in a position of power in out-of-print books – but out-of-print books aren't exactly hot commercial properties," says Courant. Balanced against that are the benefits to readers: "Being able to use these orphan works is much, much better than nothing."

Opponents say this understates the potential value to Google in the long run. Having the world's most comprehensive collection could make it the default first choice for book buyers, overshadowing Amazon.com's claim to be the world's biggest bookstore. "You're much more likely to turn to Google first because they'll have many more titles," says the law school's Grimmelmann.

The international dimension to the debate over orphan works has also started to resonate, particularly in France, where a lawsuit against Google brought by local publishers is due to be heard next month.

Under the Berne Convention, a long-standing international copyright agreement, copyright owners do not have to register in every country in order to protect their rights. The opt-out provision of the US settlement appears to fly in the face of that agreement by pre-empting the rights of anyone who does not come forward.

The publicity surrounding the case, and the creation of a single registry to administer rights, should encourage more rights holders outside the US to come forward, says Clancy at Google. But with some European publishers already suspicious of having their rights circumscribed by American litigation, a visceral opposition has been building – particularly since the benefits from the settlement will accrue only to people in the US.

A second controversy surrounds the intended Book Rights Registry. Similar agencies representing the collective interests of artists are familiar in other parts of the media industry, such as the music world. But these are typically the creation of a legislative process or operate under close antitrust scrutiny. The settlement tries to combine two conflicting objectives – to maximise the revenues to authors and publishers while ensuring the widest possible access to the out-of-print works. Whether the complex system of incentives it creates can have the desired effect is a source of considerable unease.

"The library subscription could be excessively expensive," says Courant in Michigan, reflecting a widely held concern. Gary Reback, a Silicon Valley antitrust lawyer, adds that the registry may have an incentive to license its book rights only to Google in order to keep prices up, rather than encourage competing distributors.

Countering this, Clancy contends that Google's business model is based on obtaining the widest possible distribution: "Google's interest is to make it cheap." Even if libraries do not buy a subscription, he adds, the terminal they will receive for visitors to access Google's digital files will leave them better off than now.

With scrutiny intensifying on both sides of the Atlantic, a moment of truth is at hand for Google and its new allies (including Pearson Education and Penguin, sister companies of the Financial Times). They can push ahead with their settlement and risk provoking a backlash. Or they can try to adjust the terms to defuse some of the criticisms.

Those changes could be relatively easy to make, say opponents. Representatives of wider interests, such as libraries and readers, could be included on the book registry to prevent it limiting distribution only to Google or seeking excessive prices, says Peter Brantley, director of the Internet Archive, a non-profit organisation that is working on a digital archive of its own.

The court that is due to approve the class action settlement could also find ways to extend the "orphan works" protections to distributors other than Google, says Randal Picker, a law professor at the University of Chicago – though legal opinions are divided on whether this is possible. Google itself says it supports the idea of legislation to resolve the problem.

With the Department of Justice set to issue its verdict in less than a month, its behind-the-scenes discussions with many of the interested parties have been intensifying, according to people involved. There is so far no public indication that any voluntary changes to the complex book settlement will be forthcoming. But it seems increasingly likely that adjustments will be needed if the millions of tracts, treatises, thrillers and tragedies already embedded in Google's vast memory bank are once more to see the full light of day.