Artificial Intelligence Outperforms Human Intel Analysts In a Key Area

A Defense Intelligence Agency experiment shows AI and humans have different risk tolerances when data is scarce.

By Patrick Tucker

Science & Technology Editor

April 29, 2020

In the 1983 movie WarGames, the world is brought to the edge of nuclear destruction when a military computer using artificial intelligence interprets false data as an imminent Soviet missile strike. Its human overseers in the Defense Department, unsure whether the data is real, can’t convince the AI that it may be wrong. A recent finding from the Defense Intelligence Agency, or DIA, suggests that in a real situation where humans and AI were looking at enemy activity, those positions would be reversed.

Artificial intelligence can actually be more cautious than humans about its conclusions when data is limited. While the results are preliminary, they offer an important glimpse into how humans and AI will complement one another in critical national security fields.

DIA analyzes activity from militaries around the globe. Terry Busch, the technical director for the agency’s Machine-Assisted Analytic Rapid-Repository System, or MARS, on Monday joined a Defense One viewcast to discuss the agency’s efforts to incorporate AI into analysis and decision-making.

Earlier this year, Busch’s team set up a test pitting human analysts against AI. The first part was simple enough: use available data to determine whether a particular ship was in U.S. waters.

“Four analysts came up with four methodologies; and the machine came up with two different methodologies and that was cool. They all agreed that this particular ship was in the United States,” he said. So far, so good. Humans and machines using available data can reach similar conclusions.

The second phase of the experiment tested something different: conviction. Would humans and machines be equally certain in their conclusions if less data were available? The experimenters severed the connection to the Automatic Identification System, or AIS, which tracks ships worldwide.

“It’s pretty easy to find something if you have the AIS feed, because that’s going to tell you exactly where a ship is located in the world. If we took that away, how does that change confidence and do the machine and the humans get to the same end state?” Busch said.

In theory, with less data, the human analyst should be less certain in their conclusions, like the characters in WarGames. After all, humans understand nuance and can conceptualize a wide variety of outcomes. The researchers found the opposite.

“Once we began to take away sources, everyone was left with the same source material — which was numerous reports, generally social media, open source kinds of things, or references to the ship being in the United States — so everyone had access to the same data. The difference was that the machine, and those responsible for doing the machine learning, took far less risk — in confidence — than the humans did,” he said. “The machine actually does a better job of lowering its confidence than the humans do….There’s a little bit of humor in that because the machine still thinks they’re pretty right.”

The experiment provides a snapshot of how humans and AI will team for important analytical tasks. But it also reveals how human judgement has limits when pride is involved.

Humans, particularly experts in specific fields, tend to overestimate their ability to correctly infer outcomes from limited data. Nobel Prize-winning psychologist and economist Daniel Kahneman has written extensively on the subject. Kahneman describes this tendency as the “inside view.” He cites the experience of a group of Israeli educators assigned to write a new textbook for the Ministry of Education. They anticipated that it would take them a fraction of the time they knew it would take another, similar team. They couldn’t explain why they were overconfident; they just were. Overconfidence is human, and it is especially pronounced among high-functioning experts. Machines don’t necessarily share it.


The DIA experiment offers an important insight for military leaders, who hope AI will help make faster and better decisions, from inferring enemy positions to predicting possible terror plots. The Pentagon has been saying for years that the growing amount of intelligence data that flows from an ever-wider array of sensors and sources demands algorithmic support.

DIA’s eventual goal is to have human analysts and machine intelligence complement each other, since each has a very different approach to analysis, or as Busch calls it, “tradecraft.” On the human side, that means “transitioning the expert into a quantitative workflow,” he said. In practice, that means helping analysts produce insights that are never treated as finished but can change as rapidly as the underlying data. It also means teaching analysts to become data-literate, so that they understand confidence intervals and other statistical concepts. Busch cautioned that the experiment doesn’t imply that defense intelligence work should be handed over to software. The warning from WarGames is still current. “On the machine side, we have experienced confirmation bias in big data. [We’ve] had the machine retrain itself to error... That’s a real concern for us.”
