
How Fake Data Can Help the Pentagon Track Rogue Weapons

The Air Force Research Laboratory bought software that trains machine-learning tools to spot groups amassing biological, nuclear and chemical weapons.

The Pentagon is investing in software that uses big data to help intelligence officers keep terrorists from getting their hands on biological, chemical and nuclear weapons.

The Air Force Research Laboratory on Tuesday announced a $4.6 million contract with the software company IvySys to model different ways state and non-state actors could obtain and deploy “weapons of mass terror” around the world.

The contract supports an ongoing effort by the Defense Advanced Research Projects Agency to build tools that spot groups that may be stockpiling materials for such weapons.

“Reports of chemical weapons use around the world raises serious concerns about non-state actors' access to weapons of mass terror and reinforces fears of a possible terrorist attack with chemical, biological, radiological, or nuclear weapons in the West,” DARPA and IvySys said in a statement. “Today’s terrorist networks move operatives, money and material across borders and through the crevices of the global economy, making tracking such adversaries a daunting challenge.”

The technology would generate fictional but realistic datasets of bank transactions, emails and inventory transfers, and seed them with indicators of suspicious activity, like a shipment of toxic chemicals getting intercepted or a banker doing business with a terrorist-connected client. Agencies could then use the software to train algorithms and machine-learning tools to pick up on threatening behavior buried within massive datasets.
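To make the idea concrete, here is a minimal sketch of how a synthetic dataset with a planted indicator might be built. It is an illustration only and not IvySys's method: the account names, the `make_transactions` and `embed_chain` helpers, the amount ranges and the "relay chain" indicator are all hypothetical assumptions chosen for the example.

```python
import random

def make_transactions(n_accounts, n_background, seed=0):
    """Generate random background transfers between fictional accounts."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    accounts = [f"acct-{i:04d}" for i in range(n_accounts)]
    rows = []
    for _ in range(n_background):
        src, dst = rng.sample(accounts, 2)  # two distinct accounts
        rows.append({
            "src": src,
            "dst": dst,
            "amount": round(rng.uniform(10, 5000), 2),  # assumed range
            "suspicious": False,
        })
    return accounts, rows, rng

def embed_chain(rows, accounts, rng, hops=4, amount=9500.0):
    """Plant a hypothetical indicator: money relayed through a chain of accounts."""
    chain = rng.sample(accounts, hops + 1)
    for src, dst in zip(chain, chain[1:]):
        rows.append({"src": src, "dst": dst,
                     "amount": amount, "suspicious": True})
    rng.shuffle(rows)  # hide the planted rows among the background rows
    return chain

accounts, rows, rng = make_transactions(n_accounts=50, n_background=1000, seed=42)
chain = embed_chain(rows, accounts, rng, hops=4)
```

Because every planted row carries a `suspicious` label, a detection tool trained on `rows` can be scored against a known answer, which is not possible with unlabeled real-world data.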

IvySys Founder and Chief Executive Officer James DeBardelaben compared the process to repeatedly finding a needle in a haystack, but making both the needle and haystack look different every time. Using real-world data, agencies can train algorithms to spot only threats that already exist, he said, but constantly evolving synthetic datasets can train tools to spot patterns that have yet to occur.

The software will eventually generate massive datasets containing 10 billion variable nodes and 1 trillion individual transactions. It will also allow users to make threat patterns easier or harder to find.
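One way to picture a control that makes threat patterns easier or harder to find is a single difficulty setting that moves planted activity closer to, or further from, ordinary activity. The sketch below is an assumption for illustration, not a description of the IvySys software: the `planted_amounts` function, the `difficulty` parameter and the dollar figures are all hypothetical.

```python
import random
import statistics

def planted_amounts(n, difficulty, rng,
                    background_mean=2500.0, obvious_mean=50000.0):
    """Draw amounts for planted transfers.

    difficulty is in [0, 1]: at 0 the planted amounts sit far from the
    assumed background mean (easy to spot); at 1 they are centered on it
    (hard to spot). All figures are illustrative assumptions.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be between 0 and 1")
    # Linearly interpolate the mean from "obvious" toward "background".
    mean = obvious_mean + difficulty * (background_mean - obvious_mean)
    return [max(1.0, rng.gauss(mean, 500.0)) for _ in range(n)]

rng = random.Random(7)
easy = planted_amounts(200, difficulty=0.0, rng=rng)
hard = planted_amounts(200, difficulty=1.0, rng=rng)
easy_mean = statistics.mean(easy)
hard_mean = statistics.mean(hard)
```

With these assumed settings, the easy batch stands well apart from typical transfers while the hard batch blends in, so a detector has to rely on structure rather than size alone.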

“It’s an enabling technology that allows [agencies] to build robust threat detection software tools,” DeBardelaben told Nextgov. “It gives them data to train against and it allows them to vary the threats that their tool can detect.”

IvySys began working on the project in September 2017, he said, and the DARPA contract will run for four years.