Online profiling: DPI's bad, data mining's worse

What to make of Google's e-mail scanning

Congress seems concerned about carrier deep packet inspection techniques, even though it essentially required them to adopt the technology back in the 1990s. Oh, and what about Google's e-mail search methods?

Congress recently issued a request to carriers, telecom providers and ISPs to explain exactly how, and under what circumstances, they're inspecting user online content. Specifically, lawmakers are concerned about deep packet inspection (DPI), a generic name for technologies that enable service providers to capture and inspect packet flows.

Oh boy…talk about "have you stopped beating your wife yet?" Apparently the folks in Washington, D.C., have short memories. Back in 1994 Congress passed the Communications Assistance for Law Enforcement Act (CALEA), which mandates that carriers have the ability to capture and inspect packet flows (and forward them to law enforcement agencies) — which pretty much requires DPI.

But that's not all. As AT&T points out in its response to the Congressional request, if the real concern is tracking online behavior, DPI is a red herring. Search and application vendors such as Google regularly scan user content and use data mining techniques to build online profiles of users.

Specifically, Google routinely searches through any e-mails sent or received within Gmail to enable it to provide "customized content and advertising." And these e-mail scans are also cross-correlated with Web searches. For example, Google may note that I mentioned plans for a trail hike in an e-mail to a friend, then conducted a Web search hours later for "trail shoes".

There are two key points here. First, if the Feds think DPI is a bad idea, they shouldn't have written laws that essentially require it. Second, if you think DPI is bad, data mining is plenty worse. As noted above, Google and others are actively scanning e-mails on a regular basis today, something carriers don't do.

In short, if you hate DPI, you should despise data mining.

But weirdly enough, the same folks who castigate carriers for DPI often defend search engines and application vendors for data mining. The most common defense is that search engines are "opt-in".

Sorry, guys, that's bogus. All content stored on Google's site is scanned. That includes mail to a Gmail account, even if the sender didn't realize it was being delivered there. As the good folks at the Electronic Privacy Information Center note: "Non-subscribers who are e-mailing a Gmail user have not consented, and indeed may not even be aware that their communications are being analyzed or that a profile may [be] being compiled on him or her."

Moreover, many cash-strapped organizations, such as schools and universities, are planning to outsource their e-mail to Google, thereby requiring students to hold Gmail accounts. No opt-out options available.

The bottom line? The United States sorely needs a privacy policy that will articulate what service providers can and can't do with user data — and under what circumstances. That policy should apply to search and applications vendors as well as telcos and ISPs. And it shouldn't contain contradictions, such as simultaneously disallowing and requiring DPI.
