The program is called Deep Intermodal Video Analytics—or DIVA—and it seeks to locate shooters and terrorists before they strike.
The intelligence community is working on amping up people-recognition power to spot, in live videos, shooters and potential terrorists before they have a chance to attack.
Part of the problem with current video surveillance techniques is the difficulty of recognizing objects and people simultaneously, in real time.
But Deep Intermodal Video Analytics, or DIVA, a research project out of the Office of the Director of National Intelligence, will attempt to automatically detect suspicious activities, with the help of live video pouring in through multiple camera feeds.
ODNI’s Intelligence Advanced Research Projects Agency is gathering academics and private sector experts for a July 12 "Proposers’ Day," in anticipation of releasing a work solicitation.
“The DIVA program will produce a common framework and software prototype for activity detection, person/object detection and recognition across a multicamera network,” IARPA officials said in a synopsis of the project published June 3. “The impact will be the development of tools for forensic analysis, as well as real-time alerting for user-defined threat scenarios.”
In other words, the tech would scour incoming video surveillance and body-camera imagery from areas of interest for people and objects that could present a threat, or individuals and items that might have been involved in a past crime.
This is the type of video-recognition system that might have been used for identifying would-be suicide bombers before the Paris and Brussels attacks, some video analytics experts say.
Privacy laws in the United States and Europe differ, so it is unclear whether such activity-recognition software would have been legal to use on video around the time of the 2013 Boston Marathon bombings. Nextgov has contacted ODNI for comment.
What Is Sticking out of That Spectator’s Backpack?
The envisioned system will provide multiple levels of granularity, according to IARPA.
One perspective would flag "primitive activity," like people getting into or out of a car, or someone carrying an object, the synopsis states. In this experimental scenario, the video would be collected from security cameras.
Another feature would key in on complex activities: someone carrying a firearm or two people exchanging an object, according to IARPA.
The most sophisticated capability would recognize people and things in live footage from many angles, showing different perspectives of areas of interest.
This pinnacle of the program involves “person and object detection and recognition across multiple overlapping and non-overlapping camera viewpoints," IARPA officials said.
These last two experiments will take advantage of body-cam video feeds and handheld video camera images, the synopsis states. Some of the sensors also might capture infrared data and video from other portions of the electromagnetic spectrum not visible to the human eye.
Participating teams are expected to consist of experts from many technical disciplines, including artificial intelligence, probability, person re-identification, and 3-D reconstruction from video.
The intelligence community anticipates that academic institutions and private sector companies from around the world will join in.
Correctly identifying individuals before they can attack requires a system with not only keen recognition but also lots of video, data profiles and pictures of faces.
The technology must have access to a bad-guy database that already contains the would-be perpetrator’s face, cameras that can capture usable images of people approaching an area, and a way to signal guards or otherwise cut off access to the target, Defense One reported shortly after 32 people were killed in bombings on the Brussels metro and at the city's international airport March 22.
The failure to stop the same extremists behind last November's deadly terrorist attacks in Paris from carrying out the Brussels bloodshed highlighted inadequate government intelligence on terrorists and communities' lack of trust in police.
Tagging Faces and Things in a Crowd
Last year, a facial recognition system was used on live video from surveillance cameras at the European Games, in Baku, Azerbaijan, according to the tool's developer. During the June 2015 event, organizers watched a webpage that could issue an alert if a face in the crowd matched that of an individual on a watch list, explained John Waugaman, president of Tygart Technology, the company that deployed the technology.
A match scoring above a certain level of confidence generates an alert.
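The thresholding logic described here can be sketched in a few lines. This is a minimal illustration, not Tygart Technology's actual system; the watchlist entries, embedding values, and the 0.90 cutoff are all invented for the example, and real deployments use far higher-dimensional face embeddings.

```python
# Hypothetical sketch of confidence-threshold alerting against a watchlist.
# All names and numbers below are illustrative assumptions.

WATCHLIST = {
    "subject_001": [0.12, 0.85, 0.33],  # stand-in face embeddings
    "subject_002": [0.91, 0.05, 0.40],
}
THRESHOLD = 0.90  # confidence level above which an alert fires


def similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)


def check_face(embedding):
    """Return (subject_id, score) pairs for every watchlist match
    whose similarity clears the alert threshold."""
    alerts = []
    for subject_id, ref in WATCHLIST.items():
        score = similarity(embedding, ref)
        if score >= THRESHOLD:
            alerts.append((subject_id, score))
    return alerts
```

Raising `THRESHOLD` trades missed matches for fewer false alarms, which is the tuning knob behind the false-positive discussion later in the article.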
Sometimes, the vast number of faces in a highly populated area can bog down the scanning process, so high-performance computers are available for support on the back end, said Waugaman, whose customers include the U.S. intelligence community, Pentagon and law enforcement agencies.
If an agency needs to screen more faces per minute, for example, it can grab more computing bandwidth from cloud computing providers.
The IARPA research will focus on creating a scalable framework that can function in an open cloud environment, the synopsis states.
The scope of the intelligence project—the intertwining of real-time person identification, object recognition and activity detection—is the next wave of video surveillance, Waugaman said.
"Easily within the next two years, you’ll see pairing of facial and object recognition in operational use," he said.
With "accurate object detection capabilities, you can broaden the use cases from known subjects to just people that are behaving oddly," Waugaman said. Instead of the software program "being trained to find faces, it's trained to find people with backpacks or it's trained to find people carrying guns."
There could be backlash surrounding the use of activity recognition software from privacy or gun rights groups, Waugaman acknowledged.
But perhaps paradoxically, combining all the identification modes could cut the number of false alarms, also called “false positives.”
“We might know that the person is a threat," Waugaman said. "They have an object that might be a threat and they are acting in a manner that appears to be threatening. That will really help reduce the false positive rates in these systems."
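Waugaman's point can be made concrete with a little arithmetic. If each detection mode has some false-positive rate and an alert requires all of them to fire, the combined rate is roughly the product of the individual rates (assuming the modes err independently). The rates below are invented for illustration, not figures from the article.

```python
# Illustrative arithmetic: why requiring agreement across face, object,
# and activity recognition cuts false alarms. The per-modality rates
# are assumptions, and true independence is an idealization.

def combined_false_positive_rate(rates):
    """False-positive rate when an alert requires ALL modalities to fire,
    assuming their errors are statistically independent."""
    product = 1.0
    for r in rates:
        product *= r
    return product


# e.g. 5% face, 10% object, 10% activity false-positive rates
rate = combined_false_positive_rate([0.05, 0.10, 0.10])
# 0.05 * 0.10 * 0.10 = 0.0005, i.e. one false alarm in 2,000 checks
```

In practice the modes are not fully independent, so the real reduction would be smaller, but the direction of the effect is what Waugaman is describing.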