IARPA seeks security footage for better algorithm training

The Intelligence Advanced Research Projects Activity wants 960 hours of annotated video data to help it train computer vision algorithms for law enforcement and public safety.

For computer vision and facial recognition systems to work reliably, they need training datasets that approximate real-world conditions. So far, researchers have had access to only a small number of image datasets, many of which are heavily populated with still pictures of fair-skinned men. That limitation reduces the technology's accuracy when it encounters images unlike those it was trained on, such as images of women or people of color.

Another challenge is the varying quality of the video feeds available from surveillance cameras. The cameras' scope and angle, along with the lighting or weather during a given recording, often make it difficult for law enforcement to track or re-identify people in security camera footage as they try to reconstruct crimes, protect critical infrastructure and secure special events.

To help solve this problem, the Intelligence Advanced Research Projects Activity has issued a request for information regarding video data that will help improve computer vision research in multicamera networks. IARPA is seeking capability statements for an annotated video collection of 960 hours that includes:

  • Data collected over multiple days with varying illumination from a network of at least 20 cameras with varying positions, views, resolutions and frame rates that include both overlapping and non-overlapping fields of view.
  • Data captured over 10,000 square meters in urban and semi-urban environments with multiple intersections, building entrances/exits and pedestrian foot traffic, as well as signs, vehicles, trees and other obstructions.
  • Data involving a minimum of 5,000 pedestrians and at least 200 subject volunteers given instructions on how to behave and/or where to go in the camera network.

Respondents may need to partner with a state, local or municipal government or an outside third party to use cameras or collection spaces under their jurisdiction, the RFI stated. Additionally, the dataset must be approved for human-subject research and made available to the general research community under a data release process that has been vetted for privacy, legal and policy concerns.

Sensitivity to privacy issues in image collections surfaced recently when IBM expanded the number of faces available to researchers. The Diversity in Faces dataset, an annotated collection of 1 million human facial images, was an effort to increase the variety of the training data available for facial recognition algorithms. The company scraped images from Flickr's YFCC100M database, the largest publicly and freely usable multimedia collection, which contains 99.2 million photos shared under a Creative Commons license. The dataset came under fire when it was revealed that the people whose photos were included had not been informed that their images would be used.

Responses are due May 10. Read the full RFI here.