What’s the harm with metadata?

Thanks to Edward Snowden, CSEC has been outed as just as underhanded and illegal as the US’s NSA – spying on innocent Canadians in violation of both reason and their legal mandate. But taking a page from the NSA’s defence when they were caught red-handed, CSEC claims what they were doing was not really spying, just mostly spying, because they were only collecting metadata about law-abiding Canadians, not data. Oh, so it’s just the metadata. That means it’s okay, right?

The Snowden revelations have been coming so fast and so hard that it’s no surprise the spymasters in the US and their allies are tripping over themselves trying to justify their obviously illegal and unethical operations. The most recent “defence” – one even used by President Barack Obama – is that it’s okay for the spy apparatus to indiscriminately track and keep data on law-abiding citizens because it’s just “metadata”, not actual data about what you’re saying or doing:

When it comes to telephone calls, nobody is listening to your telephone calls. That’s not what this program is about. As was indicated, what the intelligence community is doing is looking at phone numbers and durations of calls. They are not looking at people’s names, and they’re not looking at content. But by sifting through this so-called metadata, they may identify potential leads with respect to folks who might engage in terrorism.

ORLY, Mr. President?

The Harper government, being the little US boot-licking toadies that they are, were quick to appropriate this “metadata defence” to justify their own obviously illegal and unethical actions.

But what exactly is metadata?

Well, every time you make a phone call, or send a text or email, the content of those things is clearly protected by privacy laws – no government agency can legally obtain the content without a warrant. However, while the content – the information within the communication – is clearly out of bounds, the information about the communication is a legal grey area that these spy organizations are claiming is fair game. In other words, they can’t legally access the content of your discussion… but they say they have unrestricted right to information about who the discussion participants were, where it took place, when it took place, and how long it lasted.

The argument from the spy czars and their enablers is that collecting this data indiscriminately and from people who are under no sort of suspicion of illegal activity doesn’t constitute a breach of privacy. There is no shortage of assholes – like this guy, from the odious Rand Corporation – who want to try to argue, hey, it’s all cool, they’re not doing anything inappropriate with all this data they collected indiscriminately about innocent people. They’re not actually spying on you, they’re just, ya know, spying around you. So, we cool now?

Here’s the thing. The data about your communications – the metadata – is not only more useful to spy agencies than the actual content of the communications in most cases… it’s far more dangerous to you if someone is collecting and using this information to draw conclusions about you.

The technology for automating the snooping of actual content of communications does not yet exist. In the few cases where there is some capability – such as the parsing of email or text messages – the technology is in its infancy and still needs human eyes as a check before any action can be taken. On top of that, the content is usually much bigger than the metadata – the metadata about a phone call could take up just a few dozen bytes, while the phone call data itself could take up several dozen megabytes, even compressed. It’s simply more efficient, and more practical, to store and search metadata. Metadata is also often more useful to investigators, because while the content of a single call may be either interesting or mundane, each bit of metadata collected can be used to discover patterns of behaviour, and those patterns can allow you to make predictions about what the subject will do next that the subject zeself can’t even make. Put another way, if you want to trick eavesdroppers and lead them on a wild goose chase, it’s trivial to perform a few scripted phone calls with misleading information to sucker them… but it’s a lot harder to fake whole patterns of behaviour that stretch over weeks, months, maybe even years.
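To put rough numbers on that size difference, here is a minimal Python sketch. The record layout (two phone numbers, a start time and a duration) is a made-up example, not any agency’s actual format, and the audio figure assumes the standard 64 kbit/s telephony rate (G.711) for a one-hour call; other codecs and call lengths will give different totals.

```python
import struct

# Hypothetical record layout, assumed purely for illustration:
# caller number, callee number, start time (Unix seconds), duration (seconds).
# "<" means no padding; Q is an 8-byte integer, I is a 4-byte integer.
RECORD_FORMAT = "<QQII"


def metadata_size() -> int:
    """Size in bytes of one packed call record."""
    return struct.calcsize(RECORD_FORMAT)


def audio_size(duration_s: int, bitrate_bps: int = 64_000) -> int:
    """Size in bytes of the call audio at the assumed bitrate."""
    return duration_s * bitrate_bps // 8


# One made-up call: fictional numbers, a one-hour duration.
record = struct.pack(RECORD_FORMAT, 16135550100, 16135550199, 1_390_000_000, 3600)

meta_bytes = len(record)
audio_bytes = audio_size(3600)

print(f"metadata: {meta_bytes} bytes")
print(f"audio:    {audio_bytes / 1_000_000:.1f} MB")
print(f"ratio:    about {audio_bytes // meta_bytes:,} to 1")
```

Under these assumptions, a single record is a couple of dozen bytes while the audio runs to tens of megabytes, so millions of call records fit in the space of a handful of recorded conversations.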

As for why metadata collection and mining is actually more dangerous to you… well, here are some examples to consider:

Mr. Smith is a well-respected businessman and politician, who works hard for his constituents and does right by his employees. Over several years’ worth of metadata collection, it is now known that Mr. Smith normally leaves work at around 1730h every day and goes home for the evening. Every so often Mr. Smith will phone from his office to his wife at home sometime between 1500h and 1730h, and when that happens, his wife usually calls to order Chinese food delivery not long after, and Mr. Smith stays at work until well after 1730h, going home late.

One day, Mr. Smith exchanges several phone calls with one of his coworkers, and makes a single call to his wife around 1630h. His wife orders Chinese food in as usually happens on one of these days. However, Mr. Smith does not stay at work late. Instead, he leaves at the normal time – 1730h – and GPS tracking in his phone shows he goes to a motel that rents rooms by the hour. The coworker he called several times that day also goes to the same motel room. He and the coworker stay in that room for an hour and a half, then they both leave to go to their own homes.

Note that we have absolutely no information about the content of the calls, or what was actually done or discussed at any point. Nevertheless, I think you can deduce what is going on.

Here’s another example:

Your phone’s metadata records show that you made three calls within the space of an hour:

  1. First, to an HIV testing clinic.
  2. Second, to your doctor.
  3. Third, to your health insurance provider.

Again, we have no idea about the content of these calls. All we have are the times the calls were made and who they were made to.
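For the technically inclined, here is a rough Python sketch of how little machinery that inference takes. Everything in it is hypothetical: the phone numbers, the `DIRECTORY` lookup table and the timestamps are invented for illustration, and real systems would be far more elaborate.

```python
from datetime import datetime, timedelta

# Hypothetical directory mapping numbers to their listed owners.
DIRECTORY = {
    "555-0101": "HIV testing clinic",
    "555-0102": "doctor",
    "555-0103": "health insurance provider",
}

# Each record is (start time, number dialled). No call content is stored.
CALLS = [
    (datetime(2014, 2, 3, 9, 5), "555-0101"),
    (datetime(2014, 2, 3, 9, 20), "555-0102"),
    (datetime(2014, 2, 3, 9, 50), "555-0103"),
]


def calls_within(calls, window):
    """Return who was called, in order, within `window` of the first call."""
    ordered = sorted(calls)
    first_time = ordered[0][0]
    return [
        DIRECTORY.get(number, "unknown")
        for when, number in ordered
        if when - first_time <= window
    ]


story = calls_within(CALLS, timedelta(hours=1))
print(" -> ".join(story))
```

A sort, a lookup table and a one-hour window are enough to turn three bare records into a sequence that invites a conclusion, whether or not that conclusion is true.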

Here’s a final example:

Mahmoud Jasser has been tracked for many years, and regularly attends mosque, communicates with other devout Muslims, and goes to religious websites.

Starting a few months ago, Jasser suddenly started making calls to several fundamentalist Muslim groups, and searching for information about them online. Over the course of half a year, he began calling more and more extremist Muslim organizations, and started going to websites associated with jihad and violent extremism.

Gee, sounds ominous, doesn’t it? Surely we should put this guy on a no-fly list right away!

Only… what if he’s just an author researching a book on violent extremism in Islam? Or what if he’s concerned about a friend who is getting involved in these groups, and he’s trying to get more information about them to intervene?

It is hard to overestimate just how much information about us can be inferred from metadata. Metadata can be enormously dangerous – even more dangerous than the data in the communication itself – not only because so much information can be gleaned from it, but because it can too easily become circumstantial evidence, leading to guilt by mere association.

This was actually why CSEC’s metadata gathering operations were shut down back in 2008. It’s true. The original program was set up by Paul Martin’s defence minister Bill Graham in 2005, in a secret order not discussed in Parliament. (So, yes, the Liberals are involved in this scandal, too.) It was shut down because Charles Gonthier – retired Supreme Court justice, who was Commissioner of the CSE at the time – asked some uncomfortably probing questions about the sharing of metadata between CSEC and CSIS or the RCMP. Gonthier observed that the RCMP could get around needing warrants if they just used the metadata CSEC collected and shared freely. It was a dangerous and embarrassing question – highlighting just how illegal CSEC’s activities really were, despite claims – so the whole program was quietly shut down. It was restarted in 2011 by Peter MacKay, once again with a secret ministerial directive, and no discussion in Parliament (along with other espionage programs that we have yet to find out about).

Don’t fall for the dismissals and the assurances that metadata is harmless. Metadata is dangerous – especially dangerous because it is divorced from context, and easily susceptible to mistaken conclusions formed by the watcher’s preconceptions. Any organization secretly collecting metadata about people cannot have their best interests in mind.

CC BY-SA 4.0
What’s the harm with metadata? by Indi in the Wired is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
