Events like the shootings in Tennessee show the possibilities and limitations of predictive analytics.
Last week, when a young Kuwaiti-born citizen named Muhammad Youssef Abdulazeez killed four Marines and one sailor near Chattanooga, Tennessee, the chattering classes quickly took to the airwaves, like a Greek chorus asking the Fates: why didn’t anyone predict this would happen?
“Nothing helped us predict this, and it makes you more concerned as a public official,” Tennessee Gov. Bill Haslam recently remarked on Fox and Friends.
Official concern about an inability to predict so-called lone wolf attacks betrays a certain level of frustration and impatience, and an assumption that law enforcement simply didn’t use all the tools at its disposal. If Amazon can predict what you’ll buy next, if Google Now knows when you’ll arrive at your destination, if Facebook can infer when your relationship is going to tank, why can’t law enforcement anticipate when someone like Abdulazeez will commit an act of murder? Let’s take a look.
In Search of the Lone Wolf
Abdulazeez’s recently recovered diary shows dependence on drugs and alcohol. He reportedly frequented a gun range and was in severe debt. In other words, his troubles weren’t particularly exotic or un-American. His father, a native Palestinian, was briefly on a terror watch list but was removed. Abdulazeez spent several months in Jordan last year, reportedly in an effort to deal with personal substance issues.
An FBI analysis of his Google search engine queries revealed a growing interest in martyrdom. In 2013, he reportedly downloaded several videos of Al Qaeda cleric Anwar al-Awlaki. His most extreme writings related to Islam come in the form of two religious posts not long before the shooting. They cite martyrdom as a goal but give no explicit indication of an impending attack. The FBI is continuing to investigate Abdulazeez’s social media presence, looking for communication exchanges between Abdulazeez and members of known groups like the Islamic State. But no such communication indicators have emerged so far, which suggests that they likely won’t any time soon.
In retrospect, those various actions and clues may suggest a terrorist ready to pop. And that’s the danger of retrospective analysis. Even the indicators that look telling don’t speak to a cause of violence. Rather, they are unremarkable behaviors that are correlated with a violent incident. (Many people, including Defense One reporters, have seen videos featuring Anwar al-Awlaki and yet have resisted homicidal behavior.) The amount of open-source data available to law enforcement, or journalists, makes correlational analysis easier, but not necessarily more valid.
If you had no limitation on the type of data you could gather on every person in the United States in order to predict an event of political violence, what would you collect? What are the potentially observable behaviors that actually predict a lone wolf attack?
Some of the best thinking on this subject comes from J. Reid Meloy, a forensic psychologist and consultant to the Behavioral Analysis Units of the FBI at Quantico. In a highly cited 2011 paper, Meloy and several co-authors compile from the psychological literature a list of eight warning behaviors that are the best indicators of the means and will to carry out a violent attack like the one that took place in Tennessee. They are:
1. Pathway: researching behavior related to a possible attack, such as looking up routes, target weaknesses, etc.
2. Fixation: a pathological obsession with the target.
3. Identification: a willingness of the attacker to commit violence as an agent of a particular belief system.
4. Novel aggression: acts of violence not directly related to an attack (such as fighting). “Such behaviors test the ability of the subject to actually do a violent act.”
5. Energy burst: increased activity related to the target; an example would be accelerated stalking.
6. Leakage: defined as “the communication to a third party of an intent to do harm to a target.”
7. Directly communicated threat.
8. Last resort: words or actions that speak to increased desperation or distress, a call for help.
Of course, not all of these behaviors have to be present for someone to be a threat. And “prevention does not require specific prediction,” Meloy cautioned in an email to Defense One. But if you wanted to predict whether a particular individual was going to launch an attack, you would seek out data that could show whether these behaviors were present.
Today, law enforcement has access to more of that data than ever before, much of it open source, through social media and other means whose collection presents no particular legal challenge. For that matter, so do Facebook and Google. One 2013 white paper from Facebook calculated that the site’s user body uploads 350 million new photos a day, many of them public. This explains why such companies have multibillion-dollar valuations even though they give most of their services away to consumers for free. They’re monetizing consumer data to automatically display ads based on key parameters across the entire expanse of a person’s digital comings and goings.
A person’s Google search history can reveal pathway and fixation behavior. Blog posts, profile facts and group membership roles can reveal identification (among other things). Arrests, human resource records or disciplinary notes and weapons purchases can speak to novel aggression. Geo-location data can provide data on stalking behavior (though going to Jordan does not constitute a threatening activity). Social media and blog posts are by definition leakage. Direct threat communication and last-resort behaviors are overt.
In other words, with every digital interaction, Abdulazeez was creating the information that authorities could have used to predict his attack, but it was spread across a variety of digital services and devices. Most of it would have been password-protected or outside of easy, or legal, open-source collection. One day, it may be possible to create an engine that scours all digital information for evidence of the above eight behaviors by one person. Until that happens, merging these disparate pieces of data to fuel a global, real-time threat screener that can be applied to the population at large will remain impossible for all practical purposes. A 2013 paper from the Swedish Defense Research Agency makes that point explicitly:
“To produce fully automatic computer tools for detecting lone wolf terrorists on the Internet is, in our view, not possible, both due to the enormous amounts of data (which is only partly indexed by search engines) and due to the deep knowledge that is needed to really understand what is discussed or expressed in written text or other kinds of data available on the Internet, such as videos or images.”
After the incident occurs, the barriers fade away and the trail appears obvious. Every event can be predicted with enough data, but not every event can be predicted in time to avert tragedy.