Case study: Google Books and Privacy

“The EFF has long believed that one of the things that makes us able to participate so meaningfully in these conversations is that we have real technologists on staff. We have trust, and we understand each other. We hire people specifically for their ability to explain technology to non-technical people.” – Cindy Cohn

Background

Among the many high-profile cases in which the Electronic Frontier Foundation (EFF), a well-known digital rights organization, has participated is Authors Guild, Inc. v. Google Inc. Centering on the Google Books project, the case began when the Authors Guild, a society of published authors and leading writers, filed a lawsuit on September 20, 2005, alleging copyright infringement by Google. Settlement negotiations centered on copyright issues, but EFF’s team sought input into the settlement to protect the privacy of readers using the platform, an issue the parties had not addressed but an important one for a major project focused on access to books. While the central arguments in the suit focused on fair use, the case demonstrates the strong partnership between staff technologists and lawyers at EFF, who worked closely together to build a campaign around user privacy rights.

Case Card

Name: The Authors Guild, Inc. v. Google Inc.

Court: United States District Court for the Southern District of New York

Decision date: March 22, 2011

Case Number: 05 Civ. 8136 (DC)

Judgment: https://www.authorsguild.org/wp-content/uploads/2014/12/2011-Mar-AG-v-Google-ASA-Rejected-SDNY.pdf

Issue: Protection of readers’ privacy rights implicated in the Google Books project

Featured Actors

Cindy Cohn | Executive Director at Electronic Frontier Foundation

Peter Eckersley | Chief Computer Scientist at Electronic Frontier Foundation

Seth Schoen | Staff Technologist at Electronic Frontier Foundation

Facts

On December 14, 2004, Google announced that the company would begin scanning millions of books from the collections of leading research libraries to create a comprehensive database that could be searched online. After digitally scanning each book, Google extracted machine-readable text from the scan and created an index of the book’s text. Google then retained the original scanned image of each book so that accuracy could be improved as technologies advanced.

The Authors Guild filed a lawsuit in the Southern District of New York in September 2005 for copyright infringement. According to the Guild, Google had not obtained permission from authors for the use of their works, thereby violating copyright law. The parties began settlement negotiations to find a solution outside the courtroom. In October 2008, after 30 months of negotiation, the parties drafted a complex settlement agreement. After the settlement agreement was filed, the judge offered third parties a chance to comment and file amicus briefs.

The settlement sparked numerous objections from various interest groups, who expressed concern about issues like public access and competition. EFF was troubled by the privacy implications for readers using the digital book service.

EFF argued that, through the project, Google would have the ability to collect nearly unlimited data about the activities of users of its Book Search and other programs. Information collected would include users’ search queries, the identity of books a user reads, how long that reader spends reading each book, and even what pages were read. According to EFF, the U.S. had a long-standing tradition of libraries upholding the privacy of visitors and defending library users against government requests for reading histories. Intellectual freedom, courts have said, depends on the ability to read books privately. EFF argued that the Google Books project should be no different.

When negotiations with Google did not elicit the response EFF wished for, the organization filed an objection to the settlement agreement on behalf of a coalition of authors and publishers, including best-selling novelists such as Michael Chabon and Jonathan Lethem, and technical author Bruce Schneier. EFF was assisted by the American Civil Liberties Union (ACLU) and the Samuelson Law, Technology & Public Policy Clinic at the University of California, Berkeley, School of Law (Samuelson Clinic). The objection, filed in September 2009, urged the Court to reject the proposed settlement unless it was amended to eliminate the chilling effects resulting from a lack of privacy.

Outcome

On March 22, 2011, Judge Denny Chin, by then a judge of the United States Court of Appeals for the Second Circuit sitting by designation in the Southern District of New York, rejected the Google Books settlement. Though the court focused on issues of copyright as the main reason for rejecting the proposal, Judge Chin briefly mentioned the privacy interests of users.

“The privacy concerns are real,” wrote Judge Chin. “I would think that certain additional privacy protections could be incorporated.”

The rejection by the court was considered a victory for digital rights organizations that had argued for stronger privacy safeguards. Executive Director of EFF Cindy Cohn said the partnership between technologists and lawyers during the case presented a strong model of collaboration on a complex legal issue that strengthened the organization overall.

Collaboration

Though EFF did not participate in all the private negotiations between the Authors Guild and Google, direct discussions between EFF and Google began after EFF partnered with the ACLU of Northern California to launch an online campaign that circulated a list of demands for Google to better protect reader privacy.

Given that Google had the technology to log reader search information as users browsed the platform, EFF’s main questions were whether reading habits were safe from fishing expeditions by the government or lawyers in civil cases, whether Google would itself use information about users’ reading history, and how Google would combine information about reading habits with other information gathered about users from its other products.

To parse these questions accurately, the lawyers required strong technical knowledge. From the beginning, Cohn invited EFF’s in-house technologists Peter Eckersley and Seth Schoen to discuss the issues directly with Google Books lawyers and technologists, along with lawyers from the ACLU. In all, the parties held three to four meetings and numerous phone calls. During the meetings, all parties would gather technologists and lawyers, including the Google Books engineering director, in a room to discuss how law and technology might function harmoniously to provide adequate protections for reader privacy.

“These meetings really flowed back and forth between the technical issues and legal issues,” said Cohn, who attended the meetings with her technologists. “In order to implement the legal standard necessary to protect people’s privacy, we had to get fairly deep into how Google Books really worked.”

The technologists helped the lawyers understand how the books would be accessed on Google’s servers, whether Google would know what books readers searched for and accessed, whether it could see the pages users read, how long they stayed on each page, what books they had read before, and which books they accessed next. Using this information, EFF worked with the Samuelson Clinic and ACLU to identify potential privacy harms arising from Google’s collection of this data, including the company’s ability to aggregate it and infer users’ political views, sexual orientation, or social values.

Technologists also helped lawyers to understand certain technical limits for proposed solutions. For example, some proposed providing users full anonymity when browsing the site. But Google’s tentative settlement of the copyright dispute allowed users to freely browse 20% of a book. To keep track of the percentage of a book a user has read, the platform needed to track them as they browsed. And when someone purchased access to a full book, the user had to be identifiable in some way. Instead, technologists worked together to propose other solutions, including that Google delete data about users every month, limit the use of watermarks to track users, and ensure that readers using services like Tor and VPNs could also access the site.
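The tension between full anonymity and the 20% browsing allowance can be made concrete with a minimal sketch. It is not Google’s implementation; the class name `PreviewTracker`, the `reader_id` parameter, and the page-counting rule are all assumptions made for illustration. What it shows is only that enforcing a per-reader cap requires the service to keep some per-reader record of pages viewed.

```python
from dataclasses import dataclass, field

PREVIEW_FRACTION = 0.20  # the 20% browsing allowance described in the text


@dataclass
class PreviewTracker:
    """Hypothetical per-book tracker; not based on any real Google Books code."""

    total_pages: int
    # reader_id -> set of page numbers already viewed (assumed state layout)
    seen: dict = field(default_factory=dict)

    def may_view(self, reader_id: str, page: int) -> bool:
        """Return True if this reader may view the page under the preview cap."""
        pages = self.seen.setdefault(reader_id, set())
        if page in pages:
            return True  # re-reading a page uses no additional allowance
        if len(pages) + 1 > self.total_pages * PREVIEW_FRACTION:
            return False  # viewing a new page would exceed the cap
        pages.add(page)
        return True


tracker = PreviewTracker(total_pages=10)
results = [tracker.may_view("reader-a", page) for page in (1, 2, 3)]
```

Without a stable `reader_id`, the tracker could not tell one reader’s third page from another reader’s first, which is why the proposals turned to limiting retention and use of the data rather than eliminating identification altogether.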

Armed with clarification from their technologists, lawyers could further discuss the company’s compliance with the auditing requirements of the settlement agreement, or the requirements for warrants when governmental authorities would request sensitive personal information. When Google published a privacy policy during negotiations, technologists and lawyers were immediately available to analyze it, and point out the policy’s insufficiencies.

“The EFF has long believed that one of the things that makes us able to participate so meaningfully in these conversations is that we have real technologists on staff,” said Cohn. “We have trust, and we understand each other. We hire people specifically for their ability to explain technology to non-technical people.”  

With these synergies at EFF, unexpectedly creative solutions often arose. In the course of his questioning of Google technologists, Eckersley asked whether company websites had the ability to fingerprint, or assign a unique identity to, each visitor. When Google refused to answer, the team developed Panopticlick, a tool that could collect the specific configuration and version information of a user’s operating system, browser, and plug-ins. Each combination of operating system, browser, and plug-ins might differ slightly from others in time zone, language, settings, or applications installed.

EFF hypothesized that users around the world had so many different combinations of these elements that a website could uniquely identify them. Each system was “fingerprintable.” After collecting the configurations and versions of many users and comparing them, EFF found that private companies could in fact identify and secretly track users visiting their sites. This information would in turn support legal arguments in the case.
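The comparison described above can be sketched in a few lines. This is an illustrative toy, not Panopticlick’s actual code: the function names, the attribute keys, and the four sample visitors are hypothetical. It hashes each visitor’s configuration into a fingerprint and measures how identifying that fingerprint is, in bits, as the negative base-2 logarithm of the share of visitors who have it.

```python
import hashlib
import math
from collections import Counter


def fingerprint(config: dict) -> str:
    """Hash a browser configuration into a short, stable identifier."""
    canonical = "|".join(f"{key}={config[key]}" for key in sorted(config))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def surprisal_bits(target: str, observed: list) -> float:
    """Identifying information in bits; target must appear in observed."""
    counts = Counter(observed)
    return -math.log2(counts[target] / len(observed))


# Hypothetical visitors; real tools collect many more attributes.
visitors = [
    {"user_agent": "BrowserA/1.0", "timezone": "UTC-8", "language": "en-US", "plugins": "pdf,flash"},
    {"user_agent": "BrowserA/1.0", "timezone": "UTC-8", "language": "en-US", "plugins": "pdf,flash"},
    {"user_agent": "BrowserA/1.0", "timezone": "UTC+1", "language": "de-DE", "plugins": "pdf"},
    {"user_agent": "BrowserB/2.3", "timezone": "UTC-5", "language": "en-US", "plugins": "pdf,java,flash"},
]

prints = [fingerprint(v) for v in visitors]
# Fingerprints held by exactly one visitor are trackable without cookies.
unique = [p for p, n in Counter(prints).items() if n == 1]
```

In this toy sample, the two visitors with identical configurations share a fingerprint, while the other two are each unique. The more attributes a site can read, the more bits each visitor reveals and the smaller the crowd they can hide in.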

The development of HTTPS Everywhere by EFF technologists follows a similar origin story. The browser extension automatically switches a user’s connections to encrypted HTTPS on websites that support it. Discussions with Google about default encryption during the legal suit sparked the conception of a technical solution. In both cases, conversations between lawyers and Google staff surfaced a technical issue that technologists realized could be addressed by building a tool.

Beyond the technology, EFF also utilized its expansive network to locate stakeholders directly affected by the lawsuit, including famous authors and publishers. The team received multiple statements supporting the protection of privacy rights from well-known authors, which built up public support for their goals.

Lessons Learned

EFF is unique in many ways. The organization was conceived with a core purpose of pursuing litigation related to digital issues, and became one of the first to hire an in-house technologist to work directly with lawyers on technical cases. While lawyers at EFF are often more technically savvy than other lawyers, technologists say they are still consulted often. As the pace of technological change quickens, the issues lawyers face are growing more technically sophisticated, according to Cohn, so having a technologist dedicated to providing explanations to lawyers is fundamental.

In communicating, lawyers and technologists at EFF have found that a shared vocabulary and trusting relationship are instrumental to quicker replies and early prevention of misunderstanding. With time, technologists grow more insightful about how to deliver an explanation when approached by a lawyer. While Schoen used to explain a technology in full detail when approached by a lawyer, he now first gauges a lawyer’s level of technical expertise and instead seeks to resolve the practical question presented by using analogies.

At the same time, when a lawyer mentions a procedural stage of a lawsuit, Schoen now has enough expertise to understand and provide more specific consultation. And when an in-house technologist does not have all the answers, EFF has found that their technologists can often point lawyers to a relevant outside expert.

The synergies are not exclusive to technologists and lawyers; Schoen said he was often called by EFF activists campaigning against or for a certain technology to write and audit the campaign’s description of the issues at play. Verifying language on technical details in outgoing communications and materials helped the organization to build credibility.

Overall, maintaining a technologist on staff is a luxury for most organizations. Of course, for institutions not solely focused on digital issues, the need for a full-time staff member may vary. For those organizations that do choose to bring technologists on staff, the synergies that emerge offer a powerful basis for cooperation on cross-team projects.
