A Global Hawk flies over Edwards Air Force Base. U.S. Air Force photo by Bobbi Zapka

The Pentagon’s AI Ethics Draft Is Actually Pretty Good

By seeking reliable, governable, traceable technology, the Defense Department could help set global standards for using artificial intelligence.

The Pentagon leapfrogged Silicon Valley on Thursday — at least in terms of ethical guidelines for the development and use of artificial intelligence by prominent organizations. 

The much-anticipated draft document released on Thursday by a Pentagon advisory group goes beyond similar lists of principles promulgated by big tech companies. If the military manages to adopt, implement, and follow the guidelines, it would leap into an increasingly rare position as a leader in establishing standards for the wider tech world.

After Pentagon leaders asked the Defense Innovation Board to draft a list of principles late last year, the board enlisted “human rights experts, computer scientists, technologists, researchers, civil society leaders, philosophers, venture capitalists, business leaders, and DoD officials,” including representatives from Facebook, Microsoft, Google and other similar outfits. The Board voted to adopt the draft on Thursday.

“AI is expected to affect every corner of the Department and transform the character of war,” the draft document says. “Maintaining a competitive advantage in AI is therefore essential to our national security.” Therefore, the board “recommends five AI ethics principles for adoption by DoD, which in shorthand are: responsible, equitable, traceable, reliable, and governable.” 

Responsible is the most straightforward: “Human beings should exercise appropriate levels of judgment and remain responsible for the development, deployment, use, and outcomes of AI systems.” That’s in keeping with established Defense Department doctrine going back to 2012.

Equitable: avoiding “unintended bias” in algorithms that might be used in combat or elsewhere. This one prescribes care in constructing datasets used in machine learning, lest — as frequently happens — biased datasets produce skewed algorithms that are, for example, racist or contribute to racial bias.

Traceable: technicians in the Defense Department must be able to go back through any output or AI process and be able to see how the software reached its conclusion. The Department describes the need here as “auditable methodologies, data sources, and design procedure and documentation.”

Reliable: Defining an “explicit, well-defined domain of use,” and lots of rigorous tests against that use.

Governable: built with the “ability to detect and avoid unintended harm or disruption.” The software has to be able to stop itself if it sees that it might be causing problems.

Those guidelines in and of themselves aren’t too surprising. But the accompanying draft white paper goes into depth in all of the different considerations and guideposts for each point. 

The document drafters, and the different parties that contributed insight to the white paper, wrestled with each point to arrive at specific language, according to Heather Roff, a researcher involved in the creation of the draft who spoke to Defense One.

The traceability guideline, for instance, suggests that the Defense Department must be able to actually audit the AI software it produces and uses. And “not just auditing data, but who has access to the data, are the models being used for something else? If so, then what is the re-use?” said Roff.

That’s key because as different services, combatant commands, and even individual units or operators develop AI software for specific projects, that software may well be useful for other missions or activities. The guideline suggests that there needs to be not just a process for sharing, but inspection and scrutiny before each instance of sharing to determine if sharing is really appropriate and consider unintended consequences. 

The governability guideline, in the words of Roff, is a “failsafe.” That’s intended to ensure not just that humans can maintain control of processes where AI software is present but also to guard against unintended consequences of shutting down AI programs, and to foresee the effects of specific programs across the Department, even if only a small portion of the military is using the system. It’s about the “layered aspect of these systems,” said the researcher. “If one layer fails, it's about graceful degradation, and designing for that.”

The reliability guideline is going to present the biggest challenge for the Defense Department as it suggests an entirely new regime for testing new software for safety. That’s something that will require time and investment.  “There is no way to do testing and evaluation of these [new AI tools] as we have in the past. The idea of speeding up the process to get systems fielded quickly, these are going to be really important stumbling blocks,” said Roff.

In drafting these guidelines, the Defense Department is following in the footsteps of tech companies like Google, Microsoft, and Facebook, all of which have published their own ethical principles for AI. Some of the points, like reliability, are even similar. 

These corporate endeavors have seen mixed success. Google, for instance, after publishing its guidelines and convening an AI ethics board in April to determine how well the guidelines were being followed, dissolved the ethics board within a week.

The big problem with corporate ethics lists, according to Ben Wagner, founder and director of the Centre for Internet & Human Rights at European University Viadrina, is that they’re voluntary. “This makes it very easy for companies to look and go, ‘That’s important,’ but continue with whatever it is they were doing beforehand,” Wagner told The Verge in April. “Most of the ethics principles developed now lack any institutional framework.”

Unlike DoD’s new draft list, for example, most corporate guidelines don’t have restrictions on sharing tools across the institution, or insist on traceability of results, or good governance. They read more like marketing statements, attempts to show that the company has a good, moral culture and is therefore using AI in a responsible way. They’re not really maps for what to do and what not to do so much as an argument to allow the company to continue using AI the way it has already decided to. In the case of Google, the guidelines are just 882 words, shorter than this article. Yet Google engineers protested heavily when they learned the company was involved with the Defense Department in a pathfinding AI effort for intelligence dubbed Project Maven. In theory, the Defense Department's ethics list should relieve some concerns from the programming community about working with the Department.

The Defense Department is in a better position to implement its guidelines and work them into the actual decision cycle for building and buying AI, precisely because the Department is at the very beginning stages of embracing the new technology. It doesn’t have to design guidelines with loopholes for already existing practices. Incorporating guidelines, standards, and, thus, principles into the development of AI across the Defense Department is what the new Joint Artificial Intelligence Center was set up to do. So the institutional framework is already present as the tools are coming online. Hopefully, that will ensure that the guidelines really are part of the process of using AI for defense, not an afterthought.


Of course, there may well come a time when Defense Department leadership feels constrained by all of these suggestions. That, after all, is what a guideline is. They would be applied internally, by the Defense Department on the Defense Department. The Defense Department isn’t in the business of losing wars, and neither China nor Russia has anything similar.

But what the principles document shows is that if the Pentagon does wind up using AI in dangerous ways, it won’t be because the Defense Department didn’t have good guidelines. It will be because the Department didn’t follow them.
