National Academies Report Urges Air Force to Name AI Testing 'Champion'

By Laura Heckmann


A recent report scrutinizing artificial intelligence testing-and-evaluation practices within the Air Force had one key takeaway: someone needs to pilot the airplane.

The report, released Sept. 7 by the National Academies of Sciences, Engineering, and Medicine and titled "Test and Evaluation Challenges in Artificial Intelligence-Enabled Systems for the Department of the Air Force," said the service will require "dedicated leadership, continuous oversight, and individual responsibility and accountability" to reach its AI goals.

In other words, the Air Force needs to put someone in charge — officially. The report said the service needs to formally designate a senior AI testing-and-evaluation official who reports to the secretary of the Air Force and has the necessary resources and authorities to implement Air Force-wide changes.

The report referred to this leader as a "champion," noting the Air Force can call the role whatever it wants, as long as it designates one.

“It has to come from the top and it has to be a formal designation,” Jack Shanahan, a member of the report committee and retired Air Force officer, said during a webinar discussing the report. “We’ve got too much experience in the past of everybody thinking something is a good idea. But without the commensurate authorities and responsibilities that go along with that good idea, you just don’t get the speed or the level of change — both the breadth and depth of change — that’s needed.”

The report calls on the champion to establish an AI governance structure, addressing another weakness identified in its findings: the Air Force lacks a cohesive, overarching AI T&E framework and comprehensive requirements.

Shanahan acknowledged finding an individual with the proper credentials is a “pretty big lift,” but once the person is found, “they should be granted the requisite authorities, responsibilities and the resources to ensure that AI T&E is integrated from program conception and appropriately funded.”

Some of the challenges before the champion, as outlined in the report, would include cultural and educational workforce reform, developing cohesive policies and guidelines, and adapting traditional testing practices to encompass constantly changing data.

The report noted the Air Force is still in the early stages of incorporating AI into its systems and operations.

“That’s not to say they’re not doing AI, that just means they’re starting to figure out how to actually incorporate this at scale,” May Casterline, principal solutions architect at NVIDIA and report committee co-chair, said during the webinar.

The report's authors did not find evidence of “enterprise-level T&E policies in place, or infrastructure, really at the scale required to support the testing of autonomous or AI-enabled autonomous systems.”

A lack of enterprise-level guidance has led to “a lot of ad hoc methods that are … not uniform,” Casterline said. “And so there are efforts to organize some of this, but like I said, the main takeaway is not quite to the scale that they need to be at.”

She noted an "excellent track record" within the Air Force T&E community in assessing and optimizing more traditional human-machine interactions, but a lack of "significant motion to adapt or pivot those approaches to tackle future AI-centric human systems."

Human-systems integration was a focus of the report, which noted that while the field has been studied extensively over the past 50 years, "it is evident that the kinds of AI anticipated soon will demand a different approach to how humans learn to work with 'smart' machines."

Another focus was cybersecurity and risk aversion, and the need to recognize and weigh the vulnerabilities inherent in artificial intelligence systems. Going hand in hand with that was the need for zero-trust capabilities and "justified confidence" in AI systems, which the report said will require "an entirely new type of T&E."

The report also examined the need for cultural change and workforce development. Shanahan said that in spite of a strong history of low mishap rates, "it's time to adapt and get a new culture … that embraces everything that it had in the past but recognizes that it's time to think about a different environment with a more risk-tolerant, agile and adaptive mindset approach."

The report ultimately broke its conclusions into six areas of focus, finding that the Air Force has historically supported a "robust set of test-and-evaluation processes and procedures" but needs to modernize to adapt to what's coming. Current practices "do not fully translate to nascent and immature software capabilities, especially the 'black box' self-learning, adaptive, data-centric nature of AI," the report said.

“The major finding that we have is that the [Department of the Air Force] really has a tiger by the tail — has a serious set of issues to get in front of now,” Thomas Longstaff, chief technology officer at Carnegie Mellon University and committee co-chair, said during the webinar.

“You can't wait until everything is solved. You can't wait until all the research is done. Because we are going to be putting these systems in deployed areas out there in the very near future,” he said.

