Google Book Search… Indexing History
Apr 9th, 2009 | Category: Articles, Featured Articles
Written by: Matt Schroettnig
Researched by: Darci G. Van Duzer
Edited by: Eric Wasik
Managing Editor: Amy E. Seely
Is that seven million books in your pocket… or are you just happy to see me? Imagine a world in which the Library of Alexandria survived, intact, to the present day.
Charged with collecting all of the world’s knowledge, the library was legendary, and its loss to the world catastrophic. Though the ultimate date of its destruction remains in constant debate, there’s a good reason for it–as Neal Stephenson put it: “It’s inherently difficult to get reliable information about an event that consisted of the destruction of all recorded information.”
The loss of the entirety of recorded human knowledge is mind-boggling to say the least. How can we possibly prevent such a tragedy in the present day? When faced with such a potentially life-altering question, I do what any sensible adult would: I Google it. Enter the Google Books Library Project (GLP), a massive undertaking wherein Google is attempting to digitize and index nearly every printed work known to man.
Library of Alexandria 2.0?
On October 28, 2008, Google announced that they had made digitized copies of more than seven million books searchable through Google Book Search (GBS), with plans to add many more. Through partnerships with many of the world’s largest universities, GBS aims to digitize their collections, index them accordingly, and ultimately make them searchable online. However, due to a slight obstacle known as “copyright law,” a significant number of authors, publishers, and publishing houses weren’t thrilled about the idea. True to (American) form, in the absence of legislation arises litigation. Enter Authors Guild v. Google, a class-action suit filed on September 20, 2005, by the Authors Guild joined by select authors, in addition to a suit filed on October 19, 2005, by five members of the Association of American Publishers (AAP): The McGraw-Hill Companies, Inc.; Pearson Education, Inc.; Penguin Group (USA) Inc.; John Wiley & Sons, Inc.; and Simon & Schuster, Inc.
Simply put, the Authors Guild was accusing Google of massive copyright infringement, based on their producing copies of works not within the public domain without the express permission of the copyright holders. Further, the AAP believed that Google stood to earn millions of dollars “freeloading on the talent and property” of their authors and publishers. These incendiary claims led to some harsh and lengthy discussions among the various parties, and over three years later we are still awaiting the Court’s final decision.
When elephants fight, the grass suffers.
On October 28, 2008, following two years of settlement negotiations, Google, the Authors Guild, and the Association of American Publishers happily announced the settlement of the litigation in a measly 134-page agreement. This agreement, which remains subject to approval by the US District Court for the Southern District of New York, allows Google to continue scanning books still under copyright (or “in-copyright” books) into its database. In addition, Google may enable users to search the full content of said books. Further, the agreement stipulates that Google will provide $34.5 million to fund the creation of the Book Rights Registry (BRR), a mechanism that allows Google to compensate the copyright holders for the right to display their works. Google will also provide an additional $45 million for distribution through the BRR to those copyright holders whose works were scanned prior to January 5, 2009. By providing the search function, Google plans to generate revenue from both advertising and selling end-users the right to see the full text of the indexed works. Pricing will be set every two to three years by agreement between Google and the BRR, and must follow the terms of the settlement, which call for prices set “to reach as many customers as possible.” Of the resulting revenue, Google will retain 37% and remit the other 63% to the BRR.
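For readers who want to see the arithmetic, here is a minimal sketch of the figures described above. The function name and the sample revenue amount are hypothetical, chosen only for illustration; the 37%/63% split and the two upfront payments are the numbers reported in this article, and the rounding rule is an assumption, not something drawn from the agreement.

```python
# Figures as reported in the article (whole US dollars).
REGISTRY_FUNDING = 34_500_000      # to create the Book Rights Registry
PRIOR_SCAN_PAYMENTS = 45_000_000   # for works already scanned
GOOGLE_PERCENT = 37                # share of revenue Google retains

def split_revenue(revenue_cents: int) -> tuple[int, int]:
    """Split revenue (in cents) into (Google's share, BRR's share).

    Hypothetical helper: Google's share is rounded down to the cent and
    the remainder goes to the BRR, so the two parts always sum to the
    total. That rounding rule is an assumption made for this sketch.
    """
    google = revenue_cents * GOOGLE_PERCENT // 100
    brr = revenue_cents - google
    return google, brr

# Total upfront commitment described in the article.
upfront_total = REGISTRY_FUNDING + PRIOR_SCAN_PAYMENTS

# Illustrative only: $10,000.00 of revenue, expressed in cents.
google_cents, brr_cents = split_revenue(1_000_000)
print(f"Upfront total: ${upfront_total:,}")
print(f"Google keeps ${google_cents / 100:,.2f}; BRR receives ${brr_cents / 100:,.2f}")
```

Working in whole cents keeps the split exact; the actual accounting mechanics are set by the agreement itself and are not modeled here.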
The seven million works reportedly digitized by November 2008 include approximately one million already within the public domain, one million in copyright and in print, and roughly five million in copyright but out of print. Many of the current issues arise within the last category, since it includes the so-called “orphan” books: works that both the publisher and the author have essentially abandoned. Consider for a moment: if neither the author nor the publisher can be located, who receives the royalties?
Please, Sir… Can I have some more?
Google currently indexes a great majority of the information found on the Internet, but they clearly want more. However, there’s a slight problem–in order to determine whether a book has indeed been “orphaned,” the copyright holder must first be sought. Under the terms of the agreement, Google must make an effort the Court finds both “reasonable and practicable” to locate authors and publishers, particularly those of purported “orphan” works. The simple rationale is that if the works are in copyright, and Google is benefiting from the commercialization of the work, it’s only fair that royalties are paid accordingly. That becomes impossible, however, if no one has a clue who should actually get the check. Additionally, under the proposed agreement, any party that does not specifically opt out of the creation of the digital doppelganger by May 5, 2009, is automatically opted in. In the meantime, the “reasonable and practicable” effort of Google amounts to the considerable task of placing at least one legal notice in every country in the world. No, really.
A growing chorus, including Microsoft, fears that Google is obtaining monopolistic control over millions of orphaned works. With no one to pay royalties to, critics fear that through their “opt-in” program, Google is becoming the sole party able to profit from the prolific dissemination. While the great majority of these orphaned works are likely to be of little value individually, collectively they represent a “broad swath” of 20th century literature and scholarship, and as such will likely prove invaluable. Taking into account advances in on-demand publishing, and the potential of hand-held readers, Google is, in essence, creating a massive digital library of other people’s literary works from which only it can profit.
The Court is permitting objections to be filed through May 5, and plans a hearing for June 11. Of particular interest are the concerns raised by attorney Daniel Kornstein, representing New York Law School’s Institute for Information Law and Policy, who wrote a letter to the court asking permission to file an amicus curiae brief. Therein, Kornstein noted that the brief would address a number of concerns, in addition to requesting that the Court “solicit the opinions of the Anti-trust Division of the Department of Justice and the Federal Trade Commission.” While it is worth noting that Microsoft is funding New York Law School’s challenge here, it does not appear that their claims are without merit.
“Free to All.”
Those three words welcome all visitors from above the main entrance to the Boston Public Library. Article I, Section 8 of the United States Constitution grants Congress the power:
“To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.”
Authors and inventors are given exclusive right to their creations in limited circumstances and for limited times. Robert Darnton, the Carl H. Pforzheimer University Professor at Harvard, wrote an impassioned and oft-quoted piece related to this very topic. A pioneer in the field of “the history of the book,” his work is appropriately titled “Google & the Future of Books.” As Mr. Darnton noted therein in reference to the above quote, our founding fathers were careful to acknowledge an author’s rights to a fair return on his or her intellectual labors, but it was the public welfare that took precedence over individual profit.
If the agreement noted above is approved, Google will provide every U.S. public library with a single terminal offering a free “Institutional Subscription” to the GBS databases. There, users can access all public domain works, as well as works that are in copyright but not commercially available (i.e., the controversial “orphaned” works). And in response to this article’s opening salvo, regrettably you can’t really walk around with seven million books in your pocket; the Google Books iPhone and Android apps currently limit you to 1.5 million.
Though Google famously claims the business model “Do No Evil,” it remains to be seen whether, in this particular endeavor, it will hold true. Further, while Google Books might not yet equal the legend that is the Library of Alexandria, its founders Sergey and Larry have certainly broken ground on a new foundation. And finally, much to my relief, there’s almost no chance of Google’s library burning to the ground. Just watch out for viruses…

Amusing that you got the link right but got the unofficial motto wrong. Many do.
At the link you cite, it currently says “You can make money without doing evil. ”
See also – http://en.wikipedia.org/wiki/Don't_be_evil
A motto of “Do No Evil” is not only impractical, it is impossible. To do /anything/ is to risk some evil. So, dinging a company for not living up to it (especially a company that has not adopted your version of their motto) is doubly fallacious and a little silly.
Cheers
Pyegar,
Touché. Though I would like to assert that I in no way intended to “ding” Google. It is my opinion that Google is merely filling a necessary intellectual void. I feel that the digitization of all books is not only necessary, but inevitable. In a perfect world, it would have been a project for a newly created digital branch of the Library of Congress… but the world in which we live is far from perfect.
Ensuring the availability of the entirety of our cultural history for future generations should be of universal concern, and thus should have been a legislative issue.
That said, in true American fashion a corporation has stepped up to do the deed, and in the absence of prescient legislation, litigation results.
My fear at this point is censorship… see today’s /. post (http://yro.slashdot.org/article.pl?sid=09/04/28/1613214) wherein Google plans to remove “inappropriate” books from their digital library. I don’t think I’m alone in stating I didn’t like my government, school, pastor, or parents telling me what was “appropriate” to read… and I really don’t like it when a multinational corporation presumes to do the same.
Thanks for your comment!
Cheers,
Matt
FYI- Updated: http://news.cnet.com/8301-30684_3-10455385-265.html