In May, the IETF published RFC 7258, "Pervasive Monitoring Is an Attack." No matter where you stand with regard to the IETF process (observer, confused, or, like the pig making breakfast for the farmer, completely committed), this is an odd RFC. It was probably the single most discussed RFC draft in recent history, across all IETF lists, generating heat along the lines of the MANET wars (though perhaps more, because the stakes are so high here). This single RFC resulted in the formation of a new “non-working-group mailing list,” PERPASS, which deals specifically with how the IETF should handle privacy concerns (you can, and SHOULD, join perpass here). Many of the problems with this draft revolve around precise definitions; humans are creatures of language, and we literally can’t think about things without creating the words to describe them. With this in mind…
What is pervasive monitoring?
The draft defines it as “…widespread (and often covert) surveillance through intrusive gathering of protocol artefacts, including application content, or protocol metadata such as headers.” But what does that mean? Essentially, it means gathering any and all information available about a wide swath of people, such as the keywords searched for, web sites visited, blog posts written, social media updates, etc., across every person in the United States. The information gathered could, in fact, involve much more than this, such as bank transactions, books read, videos watched, organizations joined, and even such things as the rate at which you type while online, spelling mistakes you make, locations you visit specific sites from, and other information you might not think is so important.
The key point is the word pervasive, along two directions: the type of information being monitored (everything imaginable, including things you might not consider private), and the number of people being monitored (everyone, in essence). As the draft says, “PM is distinguished by being indiscriminate and very large scale…”
Why is pervasive monitoring bad?
Other than calling pervasive monitoring an attack, the draft doesn’t really give a solid set of reasons for the assessment that it is, in fact, an attack. A couple of specific points might be helpful here.
One of the reasons we don’t have much privacy in the online world is that people simply don’t realize how much information they leak on a daily basis. We have little idea of the power of data analytics to reconstruct our lives in ways that describe who we are better than we could describe ourselves. The ability of a bookseller to recommend a new book based on prior purchases might be convenient, or it might be unsettling. The ability of a search engine to predict what you will look for next might be really cool, or it might be really freakish. It’s the surprise effect that is so off-putting: does the machine really know more about me than I do? Pervasive monitoring undermines the open Internet by making us ask: if the machine knows this about me, what else does it know that I don’t?
Beyond this, however, there is another impact. I do business on the Internet because I trust my bank, provider, and others to play by honest rules. Once I realize the security mechanisms on my bank account can be overcome or undermined by a “state actor,” I realize that the security isn’t foolproof. These folks aren’t looking into my bank account through the front door; they’re doing it through the back door.
Which prompts the question: If there is a back door, then who else has a key?
An honest disk encryption software package might need to say, “We’ll protect your data from everyone except anyone who looks important, or is holding a gun.” Not much protection in real life, is it?
Once we get to this point, trust starts to erode, and the Internet at large, and individual networks more specifically, become less trusted environments in which to do business. In turn, this harms the growth of the network, and hence is an attack on the infrastructure itself.
What can we do?
First, we need to realize there is an inevitable complexity tradeoff here. I’m not convinced you can design a system that reacts equally well to events within the state machine and events outside it. Fixing everything actually breaks a lot of stuff, in the end. So we’re going to have to take a realistic view of this problem, thinking through the tradeoffs rather carefully.
Second, at some point, we are going to have to decide which we really want. We want the convenience, it seems, but we also want the privacy: really good book recommendations, or real privacy. There’s not a lot of middle ground in play. As Bruce Schneier said in the plenary discussion about the perpass problem, it’s not that government agencies decided to gather all this stuff up into one big database. It’s that all this information already existed, and government organizations just decided they wanted a copy.
Should I have a version of Amazon that allows me to forgo the book recommendations in favor of more privacy? Will Amazon allow such a thing, given how much of its revenue model is driven by book recommendations? These are questions we need to ask, and begin to answer.
RFC 7258 is at least starting us down the right path.