How Google was tripped up by a bad search

An apparent error during the discovery stage of its court battle with Oracle could prove costly for Google

In the end it was a search that let Google down.

The company suffered a setback in its patent dispute with Oracle last week when a U.S. judge denied Google's request to keep an internal Google email out of the case record. The email, written by a Google engineer, could suggest to a jury that Google knew it needed a license to use Sun's -- now Oracle's -- Java technology in Android.

Ironically, considering this is Google, organizer of the world's information, the email might never have seen the light of day if the search tools used to identify documents covered by attorney-client privilege had done their job, legal experts said.

The incident also shines a light on an area of technology -- electronic discovery -- that's creating big challenges for lawyers as more communication moves online. And it helps explain why Hewlett-Packard is willing to spend US$10 billion to buy Autonomy, one of the biggest providers of e-discovery software and services.

The Google incident apparently stems from a mistake by one of the top law firms it hired to fight Oracle's lawsuit, which accuses Google of patent and copyright infringement in Android. It's a high-stakes case that could potentially cost Google billions of dollars in damages, and force it to start charging handset makers a license fee for Android.

Like many corporate lawsuits, this one began with a discovery phase. Each party is required to identify all the emails, chat logs and other documents relevant to the case, and produce them for the opposing legal team. Because there are often millions of documents involved, they use software tools to define date ranges, search for keywords and find the material they have to produce.
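
To make the mechanics concrete, here is a minimal sketch of that kind of first-pass filter, assuming each document carries only an ID, a date and its text. The Document class, the find_responsive function and the sample records are all hypothetical and invented for illustration; commercial e-discovery tools are far more elaborate.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    doc_id: str
    sent: date
    text: str


def find_responsive(docs, start, end, keywords):
    """Return documents dated within [start, end] that mention any keyword."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for doc in docs:
        in_range = start <= doc.sent <= end
        mentions = any(k in doc.text.lower() for k in lowered)
        if in_range and mentions:
            hits.append(doc)
    return hits


# Invented sample data, for illustration only.
docs = [
    Document("a1", date(2010, 8, 6), "Notes on Java license options for Android"),
    Document("a2", date(2009, 1, 15), "Java license renewal reminder"),
    Document("a3", date(2010, 8, 9), "Lunch menu for Friday"),
]

responsive = find_responsive(
    docs, date(2010, 1, 1), date(2010, 12, 31), ["java", "android"]
)
print([d.doc_id for d in responsive])
```

Only the first record survives both tests: the second falls outside the date range, and the third mentions neither keyword.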

Communications discussing legal advice with attorneys are protected by attorney-client privilege, meaning they don't have to be made public. Google argued that its potentially incriminating email fell into this category.

It was written by Google engineer Tim Lindholm last August, a few weeks before Oracle filed suit against Google. At the time, Oracle had threatened to sue Google for billions of dollars, and Lindholm was instructed by Google executives to see what alternatives to Java existed for use in Android, apparently to strengthen their negotiating position.

"What we've actually been asked to do (by Larry and Sergey) is to investigate what technical alternatives exist to Java for Android and Chrome," the email reads in part, referring to Google co-founders Larry Page and Sergey Brin. "We've been over a bunch of these, and think they all suck. We conclude that we need to negotiate a license for Java under the terms we need."

Oracle's lawyers read from the email at two hearings over the summer. It made a big impact on Judge William Alsup of the U.S. District Court in San Francisco, California, who warned Google's lawyers that it might suggest willful infringement of Oracle's patents.

"You're going to be on the losing end of this document with Andy Rubin on the stand," Alsup said, referring to Google's top Android executive.

In fact, Google's lawyers never should have produced the email in the first place. Later that evening they filed a motion to "claw it back" on the grounds it was "unintentionally produced privileged material." Oracle objected, and so began a three-month battle that culminated last week with Alsup's refusal to exclude the document at trial.

Lindholm's computer saved nine drafts of the email while he was writing it, Google explained in court filings. Only to the last draft did he add the words "Attorney Work Product," and only on the version that was sent did he fill out the "to" field, with the names of Rubin and Google in-house attorney Ben Lee.

Because Lee's name and the words "attorney work product" weren't on the earlier drafts, the e-discovery software didn't flag those drafts as privileged, and they were sent off to Oracle's lawyers.

If the software had caught the drafts, it's entirely possible Lindholm's email never would have come to light, legal experts said. In the discovery phase, each side draws up a "privilege log" that lists the documents it is withholding, with a brief description of why they're privileged.

It's not rare for attorneys to challenge items in the privilege log, but it's also not an everyday occurrence. Since the Lindholm email was addressed to a Google attorney, it's entirely possible Oracle never would have challenged it, meaning its content never would have been made public, said Stephen Hall, a partner with the law firm Bradley Arant Boult Cummings.

"If they had found that document and put it on the privilege log, it might very well not have been an issue," he said.

"It would certainly have been a lot less likely" that the email would have ended up in the public record, said Rakesh Madhava, CEO of Nextpoint, which provides an online e-discovery service.

Once Oracle had seen the document, it fought to keep it in the open. A magistrate judge determined it was not privileged, because the body of the text was addressed to Rubin, not a lawyer, and because Lindholm wrote that he was acting at the behest of Page and Brin. That made it a business discussion about how to negotiate with Oracle, the judge concluded, and simply addressing it to a lawyer didn't make it privileged. Alsup last week agreed with her.

It's unclear exactly how the email drafts slipped through the net, and Google and two of its law firms did not reply to requests for comment. In a court filing, Google's lawyers said their "electronic scanning tools" -- which basically perform a search function -- failed to catch the documents before they were produced, because the "to" field was blank and Lindholm hadn't yet added the words "attorney work product."
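
The failure Google described can be reproduced with a simple rule-based check. This is a sketch under stated assumptions, not a reconstruction of the tools Google's lawyers used: the Email class, the looks_privileged function, the address and the message text are all hypothetical. It assumes a message is flagged only when it is addressed to a known attorney or carries a privilege marker in its body.

```python
from dataclasses import dataclass, field

ATTORNEYS = {"counsel@example.com"}  # hypothetical attorney address
MARKERS = ("attorney work product", "attorney-client privileged")


@dataclass
class Email:
    doc_id: str
    body: str
    to: list = field(default_factory=list)


def looks_privileged(email):
    """Flag an email addressed to a known attorney or carrying a privilege marker."""
    to_attorney = any(addr.lower() in ATTORNEYS for addr in email.to)
    has_marker = any(m in email.body.lower() for m in MARKERS)
    return to_attorney or has_marker


# Invented messages: an autosaved draft with a blank "to" field and no
# marker, and the final version that has both.
draft = Email("draft-1", "We looked at the alternatives and need a license.")
sent = Email(
    "sent",
    "Attorney Work Product\nWe looked at the alternatives and need a license.",
    ["counsel@example.com"],
)

withheld = [e.doc_id for e in (draft, sent) if looks_privileged(e)]
produced = [e.doc_id for e in (draft, sent) if not looks_privileged(e)]
print(withheld, produced)
```

The final version is withheld, but the draft, which says the same thing, passes straight through because neither rule has anything to match.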

But documents produced for opposing counsel should normally be reviewed by a person before they go out the door, said Caitlin Murphy, a senior product manager at AccessData, which makes e-discovery tools, and a former attorney herself. It's a time-consuming process, she said, but it was "a big mistake" for the email to have slipped through.

Such errors aren't uncommon, however, and probably happen in a lot of cases, Hall and Murphy both said. They are sometimes caused by software, but more often they result from human error, Murphy said, perhaps because keyword searches weren't properly constructed.

"In this case, it just so happened that the document in question was a crucial one," Hall said.

In fact, e-discovery is creating all types of problems for lawyers. The Federal Rules of Civil Procedure were updated in 2006 to include all forms of "electronically stored information," but grey areas remain, and most lawyers don't know all the rules, Nextpoint's Madhava said.

In addition, services like Facebook and Twitter are greatly increasing the volume of data that needs to be processed, and the fast-moving world of technology makes complying with the rules a moving target.

"Lawyers are slow to adapt," said Murphy. "Lawyers like paper and we're slow to adapt to these things."

In a recent case in Illinois, she noted, a team of lawyers accidentally handed 159 privileged documents to the opposing counsel because they misunderstood how their e-discovery software worked. The software flagged the documents as privileged, but the lawyers didn't realize they had to remove them manually before the file was sent to the opposing side.

Help may be at hand. Vendors are working on "predictive coding" technology that could automate the time-consuming review process. Legal staff feed documents into the system, tell it what disclosure category they fall into, and the software uses algorithms to apply those decisions back to all the remaining documents.
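
The idea can be sketched in a few lines. The example below is an illustration only: it assumes a naive Bayes word-count classifier as a stand-in for whatever algorithms vendors actually use, which the vendors quoted here did not specify. The TinyPredictiveCoder class, the two category labels and the training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class TinyPredictiveCoder:
    """Naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of documents
        self.vocab = set()

    def train(self, text, label):
        """Record one reviewer decision."""
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        """Return the most probable label for an unreviewed document."""
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented reviewer decisions, for illustration only.
coder = TinyPredictiveCoder()
coder.train("please advise on legal risk of the license", "privileged")
coder.train("counsel review of draft legal memo", "privileged")
coder.train("build schedule for the next release", "produce")
coder.train("meeting notes on release testing", "produce")

print(coder.predict("legal memo on license risk"))
print(coder.predict("release testing schedule"))
```

With so few examples the result is a toy, but the workflow is the one described: people label a sample, and the software extends those decisions to the documents nobody has read.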

"It's kind of the new kid on the block," Murphy said.

A lot of lawyers approach the technology with trepidation, she said, because they're reluctant to trust a machine to ensure they'll produce the right documents. And courts have yet to approve the software's use.

"However," Murphy added, "there have been several studies showing that these machines that apply the algorithms are actually more accurate than human reviewers."

James Niccolai covers data centers and general technology news for IDG News Service. Follow James on Twitter at @jniccolai.

Copyright © 2011 IDG Communications, Inc.
