One company is using metadata from video posts, Wikipedia entries, and other sites to forecast geopolitical unrest.
Editor's note: Following the publication of this story, YouTube removed the video in question late Tuesday for violating its terms of service.
A YouTube video’s best day, traffic-wise, is usually the day it gets posted. Clicks generally decline quickly and post-launch spikes are rare. On Dec. 18, a year-old jihadist video called “Black Flags of Islam and Imam Mahdi” saw just such a spike, receiving enough views to reach about 70 percent of its best day’s traffic. Eight days later, an ISIS-affiliated suicide bomber detonated an explosive belt at an Ahmadi mosque in the Bangladeshi town of Bagmara, an unexpected escalation of Islamic State tactics in the country.
The video, also known as the “Black Flags of Khorasan,” saw another spike on Jan. 3; a week later, ISIS militants in boats staged a daring attack on the Libyan port of Zueitina. The story repeated itself on Jan. 21 and Feb. 12.
Narrated in English, the 26-minute video calls to “soldiers of Allah” and promises “killing upon killing upon killing.” For some ISIS fighters, it’s their “version of listening to AC/DC before weight-lifting,” said Scott T. Crino, a managing director of Predata, a predictive analytics company. “It gets them psyched up. So, often there’s a big spike in that particular [video], prior to an event occurring.”
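The kind of post-launch spike described above can be illustrated with a short sketch. This is not Predata's method, which the company has not published; it is a minimal example that assumes a list of daily view counts and flags any later day whose traffic reaches a chosen share of the video's best day. The function name, the 70 percent threshold, the seven-day launch window, and the sample numbers are all assumptions made for illustration.

```python
def find_spikes(daily_views, threshold=0.7, launch_window=7):
    """Flag post-launch days whose views reach `threshold` of the best day.

    daily_views: view counts, one per day, starting on the posting day.
    threshold and launch_window are illustrative assumptions.
    Returns a list of (day_index, share_of_best_day) tuples.
    """
    peak = max(daily_views, default=0)
    if peak == 0:
        return []
    return [
        (day, views / peak)
        for day, views in enumerate(daily_views)
        # Skip the launch window, when high traffic is expected anyway.
        if day >= launch_window and views / peak >= threshold
    ]


# Made-up numbers, not real traffic for any video.
views = [1000, 800, 500, 300, 200, 150, 120, 100, 90, 720, 110, 95]
spikes = find_spikes(views)
```

With these sample numbers, only the tenth day (index 9) is flagged, because its 720 views are 72 percent of the best day's 1,000.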
Predata specializes in finding links between online interactions and upcoming physical events. It’s the latest member of a burgeoning field. Consider the research that has gone into trying to understand how the hot Google searches of the moment reflect what’s going on in the world. So far, the results have been mixed. Lots of people Googling how to collect unemployment benefits is a good predictor of unemployment data as measured by the Labor Department (because published data is a lagging indicator), but spikes in searches for flu symptoms or remedies are not a good indicator of the number of people who actually have the flu.
The green line shows daily views of the “Black Flags” video. The yellow boxes note events that happened within a week of the viewership spikes. (Predata)
Like Google Trends, Predata measures interest around a given topic. Crino calls it “chatter,” and uses it to forecast “unrest,” which can be a terrorist attack, a protest, or something else unplanned. But Predata also looks at a variety of sites and services, including YouTube, Wikipedia, and Disqus, watching not so much what people are saying as how they are interacting. An argument in the Disqus-powered comments section of a particular blog post, for example, may suggest contention and future unrest. In 2014, heavy commenting on news articles about Russia and Ukraine preceded Moscow’s annexation of Crimea.
Predata officials also say Wikipedia edits can help predict unrest. A flurry of attempted edits to the page of a particular controversial figure or surrounding a particular incident suggests contention that may spill over into the physical world.
“Things like point of view are not allowed on Wikipedia, and so someone monitoring will come in and say, ‘No, not allowed,’” Crino said. “That will keep going back and forth. Often, the person that’s editing the page will be trying to create what they think is the new normal.”
In the months before November’s Paris attacks, the French-language ISIS Wikipedia page saw particularly heavy changes. That signal, plus others, led the Predata system to raise its prediction for a terror attack in France several times in the six weeks leading up to the attacks, said Joshua Haecker, Predata’s director of business development.
“You can see a huge spike up in likelihood of terrorist attacks, at 60%, in France on September 11, 2015, and then it drops back down for a few days and then steadily climbs from 28% up to 49%, the day before the attack,” Haecker said.
A spike in predicted activity on Sept. 11, 2015, preceded the November 13 attacks in Paris. (Predata)
Data from Wikipedia edits can also be used to predict how relationships between individual users will develop, including how friendships and antagonisms will form. In 2010, data scientist Jure Leskovec showed that he could use 16 kinds of data to predict friend or foe relationships on Wikipedia (as well as Epinions and Slashdot) with higher than 80 percent accuracy.
Why is the metadata surrounding web traffic or Wikipedia edits a better predictor than Google searches? For one thing, it’s less noisy; it’s easier and more accurate to count site visits, edits, and reverts than to parse what someone means when she Googles “flu.” (The use of metadata rather than semantic data related to literal text is also what separates the Predata platform from Recorded Future, a company supported by In-Q-Tel, the CIA’s investment arm.)
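To show why counting is simpler than parsing meaning, here is a minimal sketch that tallies edits and reverts from a page's revision history without reading any of the text. The record format (a list of dictionaries with an `is_revert` flag) and the function name are hypothetical, chosen for illustration; the article does not describe how Predata stores or scores this data.

```python
def edit_activity(revisions):
    """Count edits and reverts in a revision history.

    revisions: list of dicts, each with a boolean "is_revert" key
    (a hypothetical format, assumed for this example).
    Returns (edits, reverts, revert_share).
    """
    edits = len(revisions)
    reverts = sum(1 for rev in revisions if rev["is_revert"])
    # Share of edits that were undone; 0.0 when there is no history.
    revert_share = reverts / edits if edits else 0.0
    return edits, reverts, revert_share


# Made-up history: two of four edits were reverted.
history = [
    {"is_revert": False},
    {"is_revert": True},
    {"is_revert": False},
    {"is_revert": True},
]
activity = edit_activity(history)
```

A high revert share on a page is one simple way to express the back-and-forth editing Crino describes, though any real system would likely combine it with other signals.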
Of course, the technique has limitations. For instance, it is better at forecasting the continuance or abatement of a current trend than at predicting an event for which little data has yet been created. It can’t predict a new ISIS terrorist attack in a country that has never experienced one. And it can’t see much less than a month out, so you can’t forecast an event that will happen, say, tomorrow. Predata can, however, adjust its model for the number of Internet users in a country, since a lot more people are online in, say, Nigeria than North Korea.
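One simple way to picture the adjustment for Internet users is a per-capita rate: divide the raw count of interactions by the number of people online in the country. The sketch below assumes that approach; the article does not say how Predata actually makes the adjustment, and the function name and figures are invented for illustration.

```python
def per_capita_signal(raw_count, internet_users, per=100_000):
    """Scale a raw interaction count to a rate per `per` Internet users.

    This is an assumed normalization, not Predata's documented method.
    """
    if internet_users <= 0:
        raise ValueError("internet_users must be positive")
    return raw_count * per / internet_users


# Made-up figures: the same 500 interactions mean very different things
# in a country with many users and in one with few.
large_country = per_capita_signal(500, 50_000_000)
small_country = per_capita_signal(500, 10_000)
```

Here the same 500 interactions work out to 1 per 100,000 users in the larger online population and 5,000 per 100,000 in the smaller one.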
The group offers a weekly newsletter that features predictions — and notes about events it predicted. For instance, the April 17 newsletter noted, “The continued elevation of Abu Sayyaf signals coupled with relatively high signal levels over the past month for several potential terrorist targets, such as the Manila Light Rail Transit System and the SM Megamall in Manila, raise a definite concern. Given this unusual combination of elevated signal levels, Predata anticipates another terrorist attack in the Philippines is likely within 30 days.”
On May 1, two blasts killed more than 14 people in General Santos City.
What else can you use Predata’s services for? The Bloomberg news site has begun experimenting with its predictions about the volatility of stocks and other asset classes. If you’re in the national security business, you might use it to predict the geopolitical weather, in the same way the movement of high-pressure fronts predicts storms. Ultimately, it’s information you can use to decide what to wear for going out, or to not go out at all.
“We have a client who gave us 3,200 different types of events,” recounts Crino. “We were able to place those events within a particular province [in Egypt]. And they have an interest in some provinces, and not others, because they are an oil exploration company. When the level rises, the likelihood or level of an attack occurs and it’s greater than 50 percent, they totally change their work posture. They send them on different routes to work or they don’t let them go in at all.”
Company officials say that a handful of officials within the Pentagon and State Department will begin using Predata on a trial basis later in May. There are folks in other government agencies also using the platform, agencies that Predata can’t name. For a sense of who they could be, consider that the company’s founder, James Shinn, was the CIA’s national intelligence officer (NIO) for East Asia for many years.
As for the near future, one of the group’s April newsletters offers this to look out for: “The North Korean discussion around the KN-08 rocket spiked well above average levels on April 15, 8 days before the submarine-launched ballistic missile test. Prediction levels remains elevated for a WMD test within the next 30 days, raising the concern that Kim Jong-un may seek to conduct another Nuclear test before May’s Party Congress.”
The problem with predicting future geopolitical events is…the news is never good.