The NSA’s Terrorist-Hunting Computer for Pakistan May Have Targeted Innocents

Pakistani men, who were displaced with their families from Pakistan's tribal areas due to fighting between the Taliban and the army, sit on a roadside on the outskirts of Islamabad, Pakistan, Saturday, Nov. 2, 2013.

Muhammed Muheisen/AP

AA Font size + Print

Pakistani men, who were displaced with their families from Pakistan's tribal areas due to fighting between the Taliban and the army, sit on a roadside on the outskirts of Islamabad, Pakistan, Saturday, Nov. 2, 2013.

A new report suggests that the agency has been using a machine-learning program to identify potential terrorists, but thousands of Pakistanis may have been mislabeled.

Artificially intelligent programs can learn to walk like we do, beat game shows and board games, and even drive cars. But are they ready to find and fight enemies for us?

A new report from Ars Technica suggests that the US National Security Agency (NSA) has been using a machine learning program to identify potential terrorists in Pakistan—but its methodology may have led to thousands of innocent Pakistanis being mislabeled.

The NSA’s program—inexplicably named SKYNET, like the homicidal AI program of the Terminator film franchise—was first unveiled by documents leaked by Edward Snowden to The Intercept in 2015. According to a leaked 2012 government Powerpoint presentation, SKYNET uses “analytic triage” to calculate the probability that individuals are terrorists, using an 80-point analytical test, that evaluates factors like a person’s phone calls, location, social media activity, and travel patterns.

The system apparently flagged Al-Jazeera’s Islamabad bureau chief Ahmad Zaidan as a potential target, the Intercept’s data showed, as he often travels to conflict areas to report.

In the leaked slides, NSA claimed that SKYNET has a false-positive rate of only 0.008%, in certain instances. But Pakistan has a population of about 182 million, and the NSA was using phone records from about 55 million people for SKYNET. As Ars points out, even a that minute rate, many innocent people are likely to end up mislabeled. Some of the NSA’s tests in the leaked slides saw error rates of 0.18%, which could mean mislabeling about 99,000 people out of the 55 million.

SKYNET can be compared to the machine learning systems that businesses use to find sales leads—both methods learn a person’s traits, and compares them to model profiles based on those traits. SKYNET was trained by feeding the system with the data from 100,000 random people, and seven known terrorists. It was then tested with the task of identifying one of those seven terrorists. What’s troubling is that SKYNET does not appear to have been tested with new data, which would have shown whether the system could work in new situations, according to an expert who examined the leaked slides for Ars.

“There are very few ‘known terrorists’ to use to train and test the model,” Patrick Ball, a data scientist and director of the Human Rights Data Analysis Group, explained to Ars Technica. “If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullshit.”

It’s not clear yet what purpose SKYNET serves. Although it could be part of non-violent surveillance activities, such as monitoring suspected terrorists’ travel patterns, Ars suggests the technology could potentially be used to target drone strikes. Since 2004, the US government has carried out hundreds of drone strikes in Pakistan against alleged terrorists, according to the Bureau of Investigative Journalism.

Last year, the UN warned against nations developing autonomous weapons, due to concerns about what they might do without a human’s moral judgement. The NSA was not immediately available to comment on how SKYNET was used, or how it was trained.

Close [ x ] More from DefenseOne

Thank you for subscribing to newsletters from
We think these reports might interest you:

  • Software-Defined Networking

    So many demands are being placed on federal information technology networks, which must handle vast amounts of data, accommodate voice and video, and cope with a multitude of highly connected devices while keeping government information secure from cyber threats. This issue brief discusses the state of SDN in the federal government and the path forward.

  • Military Readiness: Ensuring Readiness with Analytic Insight

    To determine military readiness, decision makers in defense organizations must develop an understanding of complex inter-relationships among readiness variables. For example, how will an anticipated change in a readiness input really impact readiness at the unit level and, equally important, how will it impact readiness outside of the unit? Learn how to form a more sophisticated and accurate understanding of readiness and make decisions in a timely and cost-effective manner.

  • Cyber Risk Report: Cybercrime Trends from 2016

    In our first half 2016 cyber trends report, SurfWatch Labs threat intelligence analysts noted one key theme – the interconnected nature of cybercrime – and the second half of the year saw organizations continuing to struggle with that reality. The number of potential cyber threats, the pool of already compromised information, and the ease of finding increasingly sophisticated cybercriminal tools continued to snowball throughout the year.

  • A New Security Architecture for Federal Networks

    Federal government networks are under constant attack, and the number of those attacks is increasing. This issue brief discusses today's threats and a new model for the future.

  • Information Operations: Retaking the High Ground

    Today's threats are fluent in rapidly evolving areas of the Internet, especially social media. Learn how military organizations can secure an advantage in this developing arena.


When you download a report, your information may be shared with the underwriters of that document.