TidBITS: Google Books Settlement

Series: Google Books Settlement

Should Google be anointed the sole source of out-of-print, orphaned books online? That's only one of the many points of contention generated by the Google Books project, which aims to bring online vast libraries of information, much to the chagrin of groups like the Authors Guild and the Association of American Publishers.

Article 1 of 5 in series

Media Creation | 29 Oct 2008

Authors and Publishers Settle with Google Book Search

by Glenn Fleishman

Authors and publishers have agreed to settle with Google over its book search program. The surprising part is that the resulting agreement will likely make millions more books widely available, benefit the public, and increase revenues to Google, authors, and publishers.

Google wants to index all knowledge, and it thought that scanning a few tens of millions of books might be a good addition to the compendium of billions of Web pages, PDFs, and Word documents it already offers. The only trouble? Most of the books it wanted to scan are still under copyright protection. This caused the Association of American Publishers (AAP), the Authors Guild, and other organizations to gnash their teeth - and file lawsuits.

Last week, Google and a host of these complainants agreed to a settlement that a court must still approve. Google will contribute piles of cash - $125 million - to settle outstanding issues and fund a new copyright clearinghouse that will enable authors and publishers to receive funds for online viewing of works.

The settlement also clears the way for far greater access to orphaned works: books (and other material) that remain protected by copyright but are out of print or out of production, whose rights owner is nowhere to be found, and which are largely unavailable even through lending libraries.

Unlike the outcome of many lawsuits about copyright and access, this settlement could be a big win for authors, publishers, readers, and libraries. Could such a thing be possible?

(Full disclosure: I am a member of the Authors Guild. Although I did not support the particular form of the Authors Guild lawsuit, neither did I cancel my membership as a result of the legal action.)

[Editors' disclosure: With our Take Control hats on, we've worked with Google Book Search for years, and it pains me to say that the experience has been nothing but frustrating, with literally months of delay between uploading a fully searchable PDF - no need to scan anything - and having it posted. Plus, although Google's support people responded quickly to our queries, they were universally useless at addressing any complaints, such as posting delays or the existence of guaranteed broken links to Amazon.com for our titles, given the fact that Amazon doesn't resell our ebooks. I certainly hope that the settlement will mean increased exposure for Google Book Search and our content, and additional sales. -Adam]

The Backstory -- After a couple years of prep work, Google announced in 2004 its Google Print program, later renamed Google Book Search, as well as its Library Project, the controversial part.

Google started partnering with major publishers first, followed by smaller houses - a total of 20,000 so far - to make their books available in some form online.

Google's bigger objective was to partner with major academic libraries around the world, scan books using high-speed techniques it had invented, and use optical character recognition (OCR) technology to turn the scans into searchable text.

Google Book Search made it possible for anyone to search the contents of any scanned book and, depending on the copyright status of the book and other factors, view or even download some or all pages. (Microsoft started two similar programs which avoided many copyright issues, but the company shut those projects down in May 2008.)

This behavior rankled many, because Google claimed the right to scan copyright-protected books on the grounds that the company wasn't per se distributing the books, even though it had full digital copies. Google maintained - in a rough approximation - that because it was working under contract with libraries that owned physical copies of the books, making archival digital copies was perfectly legitimate, as was turning the copyrighted works into text and images that weren't revealed in whole on the Web.

The various parties aligned against Google disagreed, and filed suit in 2005.

The Variety of Works under Discussion -- Part of what publishers and the Authors Guild found problematic, and part of how the settlement on which parties agreed was designed, centers around separating works into three categories: public domain, in copyright/out of print, and in copyright/in print.

Public domain works are no longer covered by copyright, and may be used in essentially any form and any fashion. Many publishers, notably Dover, reprint public-domain works in various forms and compendiums. Copyright holders can also release all rights on works they control, placing a creation in the public domain. Google Book Search makes the full text available, including for download.
Books that are in copyright, but out of print, are often called orphan works when the owner of the rights can't be found or there's no clear owner, as when a writer dies without an estate; there are also plenty of books that writers and publishers can't find a way to get back into print or wouldn't consider bringing back into print due to low sales or other complexities. This broad category covers books that are no longer stocked or available from the commercial book trade, although sometimes individual authors buy remainders from a publisher - the last in-stock copies that a publisher intended to turn into pulp - and sell them through hard effort. The copyright for out-of-print books may be owned by a living person or his or her estate, by a trust, by a publisher, or by a company; or it may be entirely unclear who (if anyone) owns the copyright, which is likely the case for many works created before the 1980s. Out-of-print works make writers cry, because their hard-wrung prose - fiction or non-fiction or reference - is unavailable, even if the market desires it, because the economics of print publishing have until recently put their children in the gutter. Google Book Search makes the full text searchable, with snippets of context presented.
Active books are in copyright and in print. Books that are actively sold by publishers through booksellers or directly, even if they're 30, 40, or 70 years old, fit in this category. Publishers often refer to their frontlist, books that are relatively new and actively promoted, and their backlist, titles still in stock and available, and which may even sell well, but which aren't promoted. The same searching and results are allowed as with out-of-print titles. (By the way, Amazon's special-order books program, launched at the same time as the bookseller's overall store in 1995, was the first simple way to obtain in-print books that weren't routinely stocked by either bookstores or book distributors. Prior to Amazon, special-order books required time and effort on the part of a bookseller, and were often regarded as a giant pain to fulfill.)

These three categories raise the question: what's covered under copyright, anyway? I'm glad you asked.

Copyright's Increasing Longevity -- Copyright law in the United States has been tweaked quite a bit since the right was granted in the Constitution, and because of this, there's quite a bit of complexity involved. The U.S. Copyright Office has a brief explanation, as well as a more extended discussion of terms.

If I can try to boil the discussion down for published works copyrighted in the United States:

Everything copyrighted - registered with the Copyright Office - before 1922 is in the public domain.
Nearly everything registered as under copyright starting in 1922 was under copyright initially for a term of 28 years, which could be renewed on the 28th anniversary through the Copyright Office for another 28 years.
Works registered starting 01-Jan-50 are grandfathered through a variety of rules to extend their copyright with no renewal being required. There are a lot of niceties involved, but this is the general rule.
Any work copyrighted from 01-Jan-78 on is under copyright protection the moment it's created for the author's life plus 70 years, or for 95 years from publication for works owned by a company - so-called "work for hire," in which a work was created by a statutorily defined employee of a firm or institution, or for which copyright has been transferred by the individual or people involved to a company. No registration is required, but it ensures both a proof of ownership along with the maximum statutory damages (treble!) for successful proof of violation. (Before the Sonny Bono Copyright Term Extension Act of 1998, the duration was 50 years following death or 75 years for works for hire. This was also pejoratively known as the Mickey Mouse Protection Act, because Mickey's appearance in Steamboat Willie would have entered the public domain in 2000.)

A lot more explanation, which I'll avoid here, is necessary for rules surrounding other countries' copyright regulations prior to general international agreement in the 1970s about copyright terms, and rules in the United States for anonymous, pseudonymous, and unpublished works.

If you read this carefully, you'll notice a gap. If a work was registered starting in 1922 and before 1950, it would wind up in the public domain if a renewal notice were not filed. It's unclear how many hundreds of thousands or millions of works may have fallen into that gap.

But you can see that there's a giant divide. Before 1922, essentially everything. After 1922, nothing that anyone paid attention to.
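The boiled-down rules above can be sketched in a few lines of Python. This is only an illustration of the simplified summary in this article as of 2008, not of the statute itself; the function names and the `renewed` flag are hypothetical, and none of it should be read as legal advice.

```python
from typing import Optional


def simplified_status(year_registered: int, renewed: bool = False) -> str:
    """Status as of 2008 under the article's simplified rules (an assumption)."""
    if year_registered < 1922:
        # Everything registered before 1922 is in the public domain.
        return "public domain"
    if year_registered < 1950:
        # The gap: a 28-year first term that lapsed unless a renewal was filed.
        return "in copyright" if renewed else "public domain"
    # From 1950 on, no renewal is needed under the simplified rules.
    return "in copyright"


def simplified_expiry_year(
    publication_year: int,
    author_death_year: Optional[int] = None,
    work_for_hire: bool = False,
) -> Optional[int]:
    """Expiry year for works copyrighted from 1978 on, per the simplified rules."""
    if publication_year < 1978:
        raise ValueError("this sketch only covers works from 1978 on")
    if work_for_hire:
        # 95 years from publication for works owned by a company.
        return publication_year + 95
    if author_death_year is None:
        # Author still living, so the term cannot be computed yet.
        return None
    # Author's life plus 70 years.
    return author_death_year + 70
```

Calling `simplified_status(1935)` versus `simplified_status(1935, renewed=True)` shows the gap described above: the same book lands in the public domain or stays in copyright depending solely on whether a renewal was filed.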

Fair Use -- Copyright law contains a giant set of exemptions that are supposed to balance the U.S. Constitution's language against the public good. Article 1, Section 8, states that Congress shall have power "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries."

Many arguments have been made about what limited times means - Stanford law professor Larry Lessig argued the Sonny Bono Act all the way to the Supreme Court - but the idea that copyright is intended not solely for the benefit of "authors and inventors" but for society as a whole should be undisputed. (If you've followed the actions of the movie and recording industries, and legislative efforts to support their actions, you might believe that copyright is all about ownership, not public good.)

In that spirit, Congress defined exceptions to copyright, including fair use, which have further been refined by practice and the courts. There's a quadripartite test when a claimed fair use is examined: the commercial nature or lack thereof; the kind of work involved; the quantity of work used in relation to the original; the effect on the market of the original work. The test doesn't require every element to be met, but each part to be evaluated against the whole. (You can read about this in more depth at the Copyright Office.)

Google has argued that its efforts at scanning copyrighted books and making them available for search with only snippets of results meet the smell test: Google was making no specific commercial return on its book search (in fact, investing tens of millions into its library-scanning efforts with libraries), that the works were intended for public distribution, that snippets were infinitesimal parts of books, and that the search giant was stimulating demand for the books it provided results against. Google provided links to purchase the books, and could thus track sales, too.

The Authors Guild, among others, stated that the mere act of creating electronic editions that were stored and distributed required permission from copyright holders, let alone displaying the results. With a little programming work, an interested party could extract passages or entire books, too.

Without being a lawyer specializing in this area, I believe it was and remains impossible to determine whether Google or its one-time opponents would have prevailed. They clearly would have created a new sub-area of law, either affirming, denying, or making far more complicated the notion of whether creating and owning copies of copyrighted works were de facto violations of the law.

But these one-time opponents are now at least somewhat supportive of Google's efforts. What changed? Quite a lot, and in ways that all parties, and we readers, stand to benefit.

Out-of-Print Books and Book Rights Registry -- The settlement opens the way to allowing vastly improved availability of in-copyright books by separating out-of-print and in-print books into their respective categories, and collecting fees for all snippet displays, page reading, and page printing.

Publishers, authors, and other copyright holders will be able to opt out of having out-of-print books included; by default, all out-of-print books will be available. For in-print books, those who own the rights must opt in. This allows all of Google's existing partners to continue what they're doing, and publishers to experiment by adding specific titles or simply adding their entire catalogs.

If I read the settlement right, publishers who do not opt in to allow in-print titles to be included by Google will simply have their works removed if available or not added in the future. (A complete set of links to resources can be found at the Authors Guild site.)
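The opt-out and opt-in defaults described above amount to a small decision rule. Here is a minimal sketch of it in Python, based only on this article's reading of the settlement; the function name and the `"opt_in"` / `"opt_out"` labels are hypothetical.

```python
from typing import Optional


def included_in_program(in_print: bool, holder_choice: Optional[str] = None) -> bool:
    """Return True if a book would be included, per the article's summary.

    holder_choice is "opt_in", "opt_out", or None when the rights holder
    has said nothing (these labels are illustrative assumptions).
    """
    if holder_choice not in (None, "opt_in", "opt_out"):
        raise ValueError("holder_choice must be 'opt_in', 'opt_out', or None")
    if in_print:
        # In-print books are included only when the rights holder opts in.
        return holder_choice == "opt_in"
    # Out-of-print books are included by default unless the holder opts out.
    return holder_choice != "opt_out"
```

The asymmetry is the point: silence from a rights holder means inclusion for an out-of-print book and exclusion for an in-print one.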

Where this agreement goes far beyond Google's current program, making it a win for Google, is that Google will now be able to provide not just snippet results, but entire pages or books (for viewing and printing).

Google would collect the fees and pass them on to the Book Rights Registry, which will be run by a board of authors and publishers, and be founded with $34.5 million of a $125 million settlement that Google has agreed to pay - without admitting that any of Google's claims are invalid.

Authors and publishers win by suddenly having a mechanism to disseminate electronic editions while collecting for per-snippet, per-page viewing, and per-page printing. Google has agreed to a 63-37 split in favor of the copyright holder.
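As a rough illustration of the 63-37 split, the following Python sketch divides a fee between the copyright holder and Google. It works in whole cents to avoid floating-point drift. Rounding the holder's share down is an assumption of this sketch; the article doesn't say how fractions of a cent would be handled, and the function name is hypothetical.

```python
from typing import Tuple


def split_fee(fee_cents: int) -> Tuple[int, int]:
    """Split a fee (in cents) 63-37 in favor of the copyright holder.

    Returns (holder_cents, google_cents). The holder's share is rounded
    down and Google receives the remainder (an assumed rounding rule).
    """
    if fee_cents < 0:
        raise ValueError("fee_cents must be non-negative")
    holder_cents = fee_cents * 63 // 100
    google_cents = fee_cents - holder_cents
    return holder_cents, google_cents
```

Under these assumptions, a $10.00 fee (`split_fee(1000)`) would send $6.30 to the copyright holder and leave $3.70 with Google.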

The public wins because the settlement calls for a free subscription license for "designated" computers at all U.S. public and academic libraries - a miserly 1 per public library building or either 1 per 4,000 or 1 per 10,000 students, depending on the institution type. Google has also agreed to pay all printing royalty fees for 5 years or up to $3 million, whichever comes first, for these qualifying locations.
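To see just how miserly those ratios are, here is a small Python sketch of the free-terminal allotment. The article doesn't say which institution type gets which student ratio or how partial allotments are rounded, so the ratio is passed in as a parameter and rounding up is purely an assumption; the function names are hypothetical.

```python
def free_terminals_public(buildings: int) -> int:
    """One designated computer per public library building."""
    if buildings < 0:
        raise ValueError("buildings must be non-negative")
    return buildings


def free_terminals_academic(students: int, students_per_terminal: int) -> int:
    """One designated computer per 4,000 or per 10,000 students.

    Rounding up to a whole terminal is an assumption of this sketch.
    """
    if students < 0:
        raise ValueError("students must be non-negative")
    if students_per_terminal not in (4000, 10000):
        raise ValueError("the article mentions only 4,000 or 10,000")
    # Integer ceiling division, avoiding floating point.
    return -(-students // students_per_terminal)
```

With these assumptions, a campus of 20,000 students would receive five free terminals at the 1-per-4,000 ratio and only two at the 1-per-10,000 ratio.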

Other institutions can pay for overarching printing and reading licenses, and public libraries can upgrade to fuller licenses, too. Without knowing what these more extensive subscriptions cost, it's hard to know whether public libraries will be able to afford them. Wade Roush of Xconomy, from whose writing I learned about the limits on free library access, is down on the whole deal, partly due to the scale of free access and partly due to the default pricing that Google will set on out-of-print, in-copyright books.

Anyone who researches a topic should benefit from the availability of out-of-print works, as they comprise many millions of titles that are rarely available in wide circulation. Ten libraries around the world might have a particular book you need, but that doesn't mean you can gain access to it.

Google has also agreed to pay legal fees, and at least $45 million to copyright holders whose works were scanned before a certain date connected to the lawsuit.

Now, of course, not all publishers or copyright holders are represented by the parties involved, and some may choose to sue separately in the future. The court might also require the parties to appear in court, although courts prefer settlements.

The only fly in the ointment is that copyright holders of out-of-print but in-copyright works are being de facto opted in to having their works available by virtue of this settlement, even if they're not party to it. That should fly, because most of these creators or owners can get no value out of their works at present, and few people complain about receiving additional compensation. Further, the creation of a clearinghouse gives a kind of imprimatur, allowing a party that represents authors and publishers to make sure out-of-print works see life again.

There was the notable case in the music world of James Carter, a former convict whose voice was recorded on a chain gang in 1959 by pioneering folk music collector Alan Lomax. In 2002, the song he sang, "Po' Lazarus," was used in the opening of the movie "O Brother, Where Art Thou?" The soundtrack sold 4 million copies.

Carter, who left prison in 1967 and had led a quiet life since, was tracked down after months of research by the Lomax archives, and presented with a $20,000 check; he received $100,000 by his death in 2003.

Avoiding Collision with the Future -- I'm a writer. I make my living by sitting down and typing, as I am now. The notion of Google appropriating my words without my permission or acknowledgment always bothered me, even though I also accepted that there was a fine chance that the company was operating within the legal constraints of copyright law.

I similarly was troubled by the Authors Guild partnering with what is often its natural enemy, the AAP, in trying to prevent Google from related activities, some of which seemed to benefit me and authors, and others of which did not. (For instance, the AAP at times has suggested that public libraries should pay fees to publishers when they lend works. While this is the case in EU nations, authors generally don't believe that publishers would pass along these fees to authors; that's separate from the seemingly un-American idea that public libraries pay royalties!)

This reconciliation doesn't solve all issues, but it makes it much more likely that independent authors and publishers survive and even thrive by providing a broader marketplace, while also providing greater availability of human knowledge. While the ease of access to publicly promulgated information, like Web pages, has increased, trends seemed to suggest that books would go down the path that movies are still taking and music is slowly escaping from: being available only in highly restricted ways that interfere with technological progress.

With this new agreement in place, it's possible that you could publish a book, distribute it entirely through Google Book Search, and earn some money - maybe even a lot of money if the book goes viral - and bypass publishers entirely. That was the promise of the Internet music, blog, and podcast revolutions, too. While it hasn't come true for everyone, it's certain that many more voices are being heard by many more people around the world. And that's a good thing.

[Note: This article was edited to clarify the difference between in-copyright, out-of-print works which are orphaned - no copyright owner is known or can be found - and those that are not.]

Article 2 of 5 in series

Entertainment | 02 Mar 2009

Why the Kindle 2 Should Speak When Permitted To

by Glenn Fleishman

The Authors Guild wants its members to be able to choose which electronic rights they grant to their works. As a result, the Guild is painted as a villain for apparently suggesting parents shouldn't read books to their children without paying fees. It's all about revenue, permission, and closed systems. Oh, and the Amazon Kindle 2.

The Authors Guild isn't a league of those opposed to readers reading more. I'm a member, and I'm no supervillain. Rather, the Authors Guild is fighting a multi-decade rearguard action to prevent the erosion of authors' rights and royalties. Readers are getting caught in the crossfire as worlds collide: publishers, booksellers, Google, and authors.

The latest salvo popped up over Amazon's Kindle 2. The revised electronic book reader includes a much-advertised text-to-speech option that reads material aloud in a fairly decent quality computer voice. You wouldn't want to listen to it all day, except perhaps on a long flight or car ride, but it's plenty good to get through reading a newspaper on a morning commute.

The trouble is that Amazon didn't purchase the audio performance rights for the works the company sells on Kindle. Historically, rights for audiobooks, dramatic performance, and other renditions of a printed work are separately reserved and discussed in author contracts. An author may sell one form of work to one publisher, and sell other rights in other ways, or retain them for their own distribution. A publisher might buy a bunch of rights and separately sell them to other groups.

The Authors Guild said it was looking into this topic, and suggested its members make sure their contracts were in order regarding what rights they assigned for electronic books. Executive director Paul Aiken maladroitly told the Wall Street Journal, "They don't have the right to read a book out loud."

This resulted in the Guild being immediately lambasted in a variety of specious ways by essentially well-meaning folks. Some claimed that the Authors Guild was asserting reading a kid's book aloud at bedtime was "public performance" and would require a fee. It just shows you how hot tempers flare. In context, Aiken was referring specifically to the Kindle 2's feature.

Amazon pulled back on text-to-speech reading after only a few days, stating that the Kindle 2 will be modified to allow a per-title permission for text-to-speech reading based on a publisher's preference. That move didn't silence the debate, though.

While some may accuse the Guild of trying to extract money from unsuspecting users, I find that Amazon is acting as both a licenser and an enabler for users. If publishers want to sell content to readers directly for a range of devices that can all read the same format or formats, that's very different from Amazon's closed store, direct rights purchase, and bundling of different kinds of features into one device.

Braille Kindle? -- An organization representing visually impaired Americans, the National Federation of the Blind, assailed the Guild's statements, noting that the Guild was "advising its members to consider negotiating contracts prohibiting e-books to be read aloud by the new Amazon Kindle 2." Not quite. The Guild was suggesting that as long as Amazon was buying one right and selling others, Guild members should explicitly reserve this right. In essence, the Guild wants Amazon to pay for (or at least have) the rights they're selling.

The NFB's criticism was odd in other ways. The right for assistive devices to allow visually impaired people to access information protected by copyright without violating rightholders' interest is enshrined in federal law. Plus, the NFB failed to note that the Kindle 2 isn't designed to be accessible by the visually impaired; the group might talk to Amazon about that.

(And, by the way, if a special Kindle 2 were issued to those without sight, federal law would allow the device to speak the text, regardless of rights, just as existing reader software and hardware does. Radio stations operate reading services for the blind on special subcarrier stations, which require a tuner that, by law, you have to be certified to obtain and use. Readers speak aloud newspapers, magazines, and other materials.)

The DRM Argument -- People opposed to digital rights management, which locks up media preventing personal use outside of devices that a given DRM-protected seller authorizes, took exception in another way, saying that the Guild was trying to buy into the notion that users aren't allowed to do what they like with media they own.

But I'd argue that's not the issue with Amazon and the Kindle 2. Authors and creators should have the right to control how their works are disseminated and transformed. The issue with DRM is whether you can "legislate" that control, through encryption, closed systems, and laws that prevent breaking the encryption; or whether explicit licenses should be enforced through more typical means, like lawsuits and restraining orders and so forth for violations of copyright.

I've long argued that DRM was an unnecessary measure because it's so easy to circumvent. If you can watch, read, or listen to something, you can get around the DRM. The issue was how people, owning unlocked material, would misuse it or not. The explosion in unprotected music and the transition of the entire digital music industry to DRM-free sales shows that giving people the option to play music on any device they own doesn't result in all music being free everywhere and never purchased again. (Essentially all music is already freely downloadable, thus it was already a de facto situation.)

However, making copies and playing music on multiple devices doesn't transform the music. Let's take it a step further. What if you wanted to transform the music you purchased? It's in MP3 or AAC format, and there are plenty of audio editors that would let you mash-up, change the tempo, or overlay tracks. You might even scrub the vocals, and run a song underneath a video you created and uploaded to YouTube.

That's all well and good. The software you would use for this purpose is general-purpose software that's not marketed for the purposes of creating new music rights. Until you upload to YouTube, you're making a new work for personal purposes, and there's little or no law that could reach out and touch you for that. (Once you upload to YouTube, the video could be removed because of the audio you used, if someone notices and cares, but you'll likely face no legal repercussions.)

The company that sold you the digital music track isn't marketing the track or delivering it to you in a way that allows or encourages some kind of transformational power, nor does it lock the playback to software that it distributes.

Contrast this with Amazon and the Kindle 2. Amazon is purchasing a license from a publisher or author to sell the right to read a book in digital form. Amazon uses DRM, so Amazon controls the playback, locking reading to a Kindle 2's screen. When Amazon marketed the Kindle 2 initially, one of the ballyhooed features was text-to-speech. People were buying the device and downloading books and other media with some expectation that they could get audio versions, however clunky, of what they bought.

This is the crux of the matter. As Authors Guild president Roy Blount, Jr., wrote in a New York Times op-ed piece, "What the guild is asserting is that authors have a right to a fair share of the value that audio adds to Kindle 2's version of books."

Blount and the Guild aren't asking users to pay an audio tax, nor are they asking users to forgo text-to-speech. The question is whether authors get paid for this right, which they've traditionally owned, or if this isn't a new right at all.

A Twist on the Situation -- Here's where I may diverge from the Guild's position in the future. I would argue that if Amazon were selling the Kindle 2 as a generic text viewing device that could display PDFs, text files, and other documents, and that also had a text-to-speech capability, but wasn't specifically operating a bookstore and marketing the audio feature, there wouldn't be a debate here.

If you could purchase or otherwise acquire books and documents from a variety of sources, I can't see how any sensible party could argue that text-to-speech reading of works on that device constituted anything but a personal transformation of the work. (The Guild hasn't asked, for instance, that text-to-speech be disabled for your own documents that you transfer via USB or via Amazon's fee-based wireless service.)

In that case, the user chose to listen, the device's seller facilitated that without being engaged in rights purchase or management, and the text's rightsholder sold the work in that format with the full knowledge that individuals might do things with the work for their own purposes. They might, in fact, charge more for the work because of the expectation that it would be used in more ways and more widely read.

Back in 2002, I joined a model lawsuit initiated by the Electronic Frontier Foundation along with four other plaintiffs to counter efforts by the film and television industry to restrict the time-shifting and space-shifting features of the ReplayTV.

The issue there was that the 29 defendants had sued the firm that made ReplayTV, a digital video recorder like a TiVo, because the company had embedded an intelligent commercial skipping algorithm, and a method of sending recorded programs among a set of registered devices. One media executive made the claim at the time that viewers were required to watch commercials, although bathroom breaks were permissible.

My defense of ReplayTV might seem at odds with my support of the Authors Guild. However, so far as the Guild has made its case to date, I'm entirely consistent.

ReplayTV was designed for a variety of purposes, some of which could potentially infringe on a creator's rights, but most of which did not. Likewise, the recording industry made a few lunges at the iPod in the first few years after Apple released the player, stating boldly that the device was almost entirely focused on infringing uses. That argument didn't hold.

The Kindle 2 is much like the iPod. However, the particular feature that the Guild has singled out is designed only for infringing purposes, if you buy the argument that the user isn't violating the law in using it, but rather that Amazon is tiptoeing around obtaining a license for a purpose other than the one that they told publishers the Kindle 2 was going to employ.

(I may be wrong here about the specifics involving Amazon contracts. I have heard nothing from publishers that said that Amazon explained the text-to-speech feature before the Kindle 2 was formally launched. Amazon may also have the right to use text-to-speech in some explicit fashion in some of their contracts.)

This would be much like if you bought a ReplayTV that had a number of ways to record programs, delete the commercials while recording broadcasts, send them to different devices on your home network, and act as a peer-to-peer distribution client for any program on its hard drive.

You'd have no support from me that the first three tasks (recording, stripping ads, or personal use space-shifting) were illegal. The commercial-stripping option might have a case for or against legality, but if it's set up clearly as something a user has to trigger and choose, no dice. However, the last use, uploading files for sharing, would clearly be out of contention even if the rest of the device were just fine.

As an author, I have no interest in withholding any rights that get people to have more access to my work. As a reader, watcher, or listener, I want the most flexibility in how I use the media I've purchased in any form. As a technology geek, I like the expansion of anything that lets people be more creative and be exposed to new ideas in interesting ways.

The resolution isn't authors and publishers blocking rights, Amazon turning off features, or users turning away from the Kindle. Rather, it's about allowing rightsholders to have a say in how their work is used when they license that work for a particular purpose.

Harkening Back to Google Book Search Settlement -- A similar set of issues came up over the last few years in an Authors Guild lawsuit, in which the Association of American Publishers and other parties were involved in suing Google over the Google Book Search program. Google was scanning millions of books, performing optical character recognition on the results, and making limited search features available. (See "Authors and Publishers Settle with Google Book Search," 2008-10-29, for more background.)

Google said that merely scanning and storing complete images of in-copyright works didn't violate any rights, nor did presenting "fair-use" snippets of results from the books. Google's stance was that this in fact encouraged sales. The Guild and its co-plaintiffs argued that Google had no right at all to do any of this.

To settle the lawsuit, Google agreed to pay tens of millions of dollars to authors and publishers for books scanned, to open the way to making available millions more books that are in a limbo state, and to provide fully agreed-on, compensated access to millions of in-print books.

This seemed like the optimal outcome, especially for readers, who gain access to a massively larger pile of reading material, and researchers for whom books locked in stacks will suddenly be available on their desktops. Authors and publishers get paid. Google makes money from ads and sales commissions.

To bring this back to the Kindle situation, the lesson from the Google Book Search settlement is that each author, or each company to which an author has assigned rights, is always making choices about derivative works. The specific case here with the Kindle 2 is that Amazon purchased one right and is trying to gain revenue by selling implicit additional rights.

I don't know whether, if this Kindle matter had reached the courts, the Guild's position or Amazon's would have prevailed. Julian Sanchez at Ars Technica examined the legal position quite evenhandedly.

It's entirely possible that Amazon would have prevailed, and it's likely that this would have cast a pall over electronic books, because audiobooks represent a $1 billion per year industry - one that Amazon participates in directly, thanks to its purchase of Audible, the leading online audiobook seller. (Amazon's ownership of Audible, which has purchased audio rights for all the books it sells, makes Amazon's initial position a bit bizarre, in fact.) Publishers might have chosen not to license works at all on devices that had text-to-speech capability to avoid giving up one lucrative market in favor of a currently unproven one.

Amazon only acknowledged that it might be better to put authors and publishers in the driver's seat when it comes to choosing which books can be read aloud by the Kindle 2 and which cannot. This move, without Amazon giving up its position that it had the legal right for text-to-speech reading, still has the effect of making creators comfortable with the notion that as new devices appear, they can expect to talk with device makers and content sellers about what text can and can't do in this new world.

For now, it seems to me that technology, readers, authors, and Amazon are all well-served. While sites like the Consumerist ran headlines such as "Amazon Allows Publishers to Kill Text To Speech Function on Kindle 2," the reality is that authors and publishers want to have works widely distributed and read, but to have the choice to embrace all technology or some of it - and be paid for that which they embrace.

Article 3 of 5 in series

Tech News | 23 Mar 2009

Sony Reader Gets 500,000 Free Public Domain Titles from Google

by Glenn Fleishman

Google tries to insert itself into the electronic reader market by making 500,000 copyright-free titles available for the Sony Reader Digital Book. Titles, all dating from before 1923, are free to download.

Google is further exposing some of the 7 million books it has scanned from academic collections by making 500,000 titles with no remaining copyright protection available to Sony for its electronic book device, the Reader Digital Book. Reports indicate that only books from 1922 or earlier are included, as 1922 is the latest date for which public domain status is entirely clear. (Many works published after 1922 are also in the public domain, but each work must be researched individually to determine its status.)

Earlier this year, Google added an option to view but not download 2 million public domain books on the iPhone; see "More Ebooks Available for the iPhone/iPod touch," 2009-02-09. That's more like a Pandora stream than an iTunes song purchase.

Google's program to scan books ran afoul of publishers' and authors' concerns about the right to scan and archive titles, and the legality of snippets being displayed from these scanned works. A preliminary settlement between Google and various interested parties should make millions of books available for viewing, printing, download, and purchase in the coming months; these titles could also wind up being available for the Reader Digital Book. (See "Authors and Publishers Settle with Google Book Search," 2008-10-29.)

It would seem that Google has chosen to side with Sony instead of Amazon in the nascent ebook reader world. The Wall Street Journal notes Sony said its Reader Digital Book sales are at 400,000 and reported that Citigroup estimated Amazon Kindle sales at 500,000. That sales level seems quite good for a new category of consumer device, but it's nowhere close to the 17 million iPhones and 13 million iPod touches that Apple has sold so far over a similar period. (The original iPhone and Sony Reader were both introduced on the same day in June 2007, the iPod touch in September 2007, and the Amazon Kindle in November 2007.)

The Kindle 2, introduced in February 2009, improves on the design of the original device and has a faster screen refresh ("Kindle 2 Improves Design, Not Features," 2009-02-26). Amazon released Kindle for iPhone shortly after the Kindle 2 hardware ("Amazon Releases Kindle Software for iPhone," 2009-03-03). Amazon offers 245,000 books for sale along with subscriptions to dozens of magazines and newspapers, and hundreds of blogs. The iPhone software can download only books, not subscriptions. That may change with Apple's iPhone 3.0 software, which will enable in-application subscriptions and purchases ("Apple Previews iPhone 3.0 Software," 2009-03-17).

Lest we forget, the volunteers of Project Gutenberg have been assiduously typing, scanning, and correcting out-of-copyright works for many years. Project Gutenberg's catalog, now containing over 28,000 books, includes downloads in text and other formats, including a DRM-free ePub format that both the Reader Digital Book and Kindle 2 can handle. Affiliated and partner projects bring Project Gutenberg's grand total to 100,000 titles.

While Project Gutenberg has a fraction of what Google has made available, the quality should be higher, as works have been prepared for accuracy instead of volume, and the selection is more likely to interest a modern general audience rather than just historians and researchers.

Article 4 of 5 in series

Tech News | 07 Sep 2009

Google Books Settlement Hits Snags

by Glenn Fleishman

The proposed settlement between Google and groups representing authors and publishers over Google's past work in scanning in-copyright titles may be scuttled over the advantages that such a settlement would confer on the search giant.

Almost a year ago, Google seemed to have hammered out a settlement about its prior scanning of books that were covered by copyright as part of its Google Books effort. Google had been sued by individuals and groups representing authors and publishers. The settlement would have established payments, a clearinghouse for handling who gets what, and fees for book portions already viewed. I spelled out what all this means in "Authors and Publishers Settle with Google Book Search," 2008-10-29.

At the time, I thought the settlement could be a win for all the parties involved because copyright holders would establish a right of control over digitizing copies of their works, and Google would widely disseminate many books that are otherwise unavailable without significant effort. Authors and publishers would accrue additional incremental revenue, earning money from page views and downloads, but also likely additional print book sales.

But three big issues have emerged in the intervening months that might scuttle the arrangement, and I might support the settlement collapsing.

Google Adopts Orphaned Works That Actually Have Parents -- The settlement gives Google what is essentially a court-provided license that allows the firm to scan all books without explicit opt-in permission from the authors and publishers who own the rights. Google can also make snippets and downloads available, while charging fees, collecting its share, and paying the remainder to a book rights clearinghouse.

Authors and publishers involved in the settlement may have an option to include or exclude specific works or their entire oeuvres or catalogs. However, for the vast majority of books covered, those who own rights will have to opt out explicitly, if they so wish. (There are compulsory licenses carved out of copyright law by Congress that require copyright holders to allow certain uses of their works, but those licenses are limited and narrow.)

This includes all orphaned works, which are out-of-print titles for which the rights holders are not readily known even though the works remain under copyright protection.

Marybeth Peters, the Register of Copyrights, a job I was unaware existed until a few days ago, said in testimony before the U.S. House of Representatives Committee on the Judiciary on 10-Sept-2009 (the testimony is all available in PDF form):

Although Google is a commercial entity, acting for a primary purpose of commercial gain, the settlement absolves Google of the need to search for the rights holders or obtain their prior consent and provides a complete release from liability.

While a court can't forestall any lawsuits by authors and publishers who claim such orphaned (or just ignored) works, the settlement would make it far more difficult for any individual or small publisher to sue Google for scanning and selling works without permission.

Exclusive Rights to Scanned Works -- If the settlement is approved, then Google winds up lucking into a deal that would be impossible otherwise: getting an agreement with thousands of parties all at once, and receiving a copyright exemption - albeit from a court that wouldn't seem to have the right to grant such an exemption.

As an Amazon exec, Paul Misener, said at the congressional hearing mentioned above,

If a potential competitor to Google engaged in "massive copyright infringement" in the hopes of getting sued by the same plaintiffs in the Google litigation and making the same settlement deal, why would the rightsholders settle on the same terms when they already have a distribution partner and would stand a reasonable chance of obtaining massive statutory damages?

Even massive firms like Microsoft, Sony, or Amazon would have to work on thousands or tens of thousands of individual deals to achieve similar results - without the inclusion of orphaned works. Microsoft, lest we forget, had its own massive book-scanning project at one time, but worked nearly entirely with titles that fell outside copyright protection, or where the issues were at least murky, providing them some cover. And even still, Microsoft dropped the project, probably because the works it scanned just didn't offer enough potential value given the high cost of scanning and processing.

The exclusivity means that Google would have de facto ownership of the idea of turning printed books into digitally available versions, offering a library far larger than any potential competitor could create.

The settlement imposes pricing tiers and structures on books offered for searching and download, with the clearinghouse in charge of that. But that puts control of pricing in the hands of the clearinghouse board, which will be dominated by authors and publishers who may use Google's monopoly to set artificially high, non-competitive prices. That's not good for readers, nor for authors and publishers who want alternatives for their works while still making them widely available. (Amazon's Misener described the clearinghouse as "a cartel of rightsholders that, for sales of books to consumers, would set prices to maximize revenues to cartel members.")

Copyright Settlement without Borders -- Finally, this settlement applies just in the United States, and there's a simmering and growing anti-settlement sentiment emerging outside America. Many authors based outside the United States sell book rights to U.S. publishers. The settlement would ostensibly sweep books published under those rights into an agreement that non-U.S. authors had no part in shaping.

Google might feel justified in scanning and offering books that are orphaned and out of print in the United States, even if the rights holders were reachable elsewhere in the world, but not through U.S. agents. Google has now pledged not to include any books that are in print in Europe, but that's likely not enough.

Dreaming of an Opt-In Future -- In the little cloud kingdom in which I apparently live, the logical course would be for authors and publishers to choose a different route. Instead of allowing Google to own the digital scans the company makes and make all decisions about dissemination, the settlement should allow Google and any other parties to scan as much as they want, but the ownership of the scans would remain in the hands of the copyright holders and be held in trust (for those that opt in) by a clearinghouse. Whoever scanned the works would receive some kind of compensation or royalty to offset the work performed; otherwise, there would be no incentive to continue to scan.

While this might seem insane for Google to hand off its hard-wrought work, the original lawsuit was partly over whether Google had the right at all to scan works to which it lacked copyright, extract text and images, and make snippets available for searching over the Internet. That right was never established, and a settlement could involve a transfer of ownership, while Google would retain its own copies and all the work performed. Google still gets first-mover advantage, and it doesn't have to delete any work it has carried out to date.

The process should also be opt-in, requiring copyright holders to agree to participation. The issue of orphan works is much larger than Google Books. The U.S. Copyright Office and its Register, Peters, have backed legislation that would allow good-faith use of orphaned works with a clear process in place. Requiring opt-out registration wouldn't suffice.

As Peters said in her congressional testimony, "Under copyright law, out-of-print works enjoy the same legal protection as in-print works. To allow a commercial entity to sell such works without consent is an end-run around copyright law as we know it."

Again, a clearinghouse that was the repository of digital copies could become the central authority for making good-faith efforts to reach authors and others. Peters said, "a compulsory license for the systematic scanning of books on a mass scale is an interesting proposition that might merit Congressional consideration."

This approach would allow any party with sufficient resources to register and pay reasonable cost-recovery fees to acquire the original scans for later OCR or other kinds of processing, such as image extraction or even typographical analysis. It wouldn't be fair or sensible to require firms to hand over both scans and converted text, as the OCR will be part of the value added, and represents a fair amount of cost.

Different for-profit and non-profit organizations might carve out parts of a large collection, and negotiate different terms for payment based on lending, rental, and sale models, instead of a one-size-fits-all payment model.

The clearinghouse, too, should be an independent non-profit with interests of commercial firms, authors, publishers, libraries, and academic institutions represented; the current clearinghouse planned, while a non-profit organization, is focused on stakeholders without a broader sample of representatives who focus on the public good.

Decisions over pricing should be non-discriminatory, but in the best interests of balancing both mass dissemination and the rights holders' desires (whatever they may be) in earnings. Rights holders could set absolute terms or allow the clearinghouse to set terms based on policies.

Days after I wrote the first draft of this story, Google - in the above-mentioned Congressional hearing on digital bookselling competition - offered the ability to resell any of the orphaned works it's scanned to its competitors, or anyone. Not quite what I envisioned, but an interesting offer. Amazon was dismissive about the offer, Wired reports.

Will any of my blue-sky ideas come to pass? It's hard to tell. The Google settlement will likely not proceed to conclusion in its current form; or, in the event a judge approves it, will be subject to additional lawsuits and regulatory involvement worldwide.

The fact that tens of millions of pages of human thought remain so inaccessible seems a crime, but any decision made has to look at the widest benefit to all the parties involved, not just anointing Google king and authors and publishers as a council of nobles.

[Disclosure: The Authors Guild is a party to the settlement, and I have been a member of the guild for several years. Because of a number of the guild's recent actions, including its support for this settlement, I have chosen to let my membership expire this year.]

Article 5 of 5 in series

External Links | 28 Jan 2010

New Google Books Settlement Fails to Placate Prominent Critics

by Glenn Fleishman

The latest revision to the Google Books settlement, an ongoing saga we've written about regularly here on TidBITS, is still opposed by Amazon.com and the Internet Archive, among others. The settlement in this revised version would still anoint Google with court approval as the only party in the United States that can scan and offer for sale copyrighted works that are out of print and for which the publisher isn't known.
