New Pentagon Task Force Exploring Generative AI

By Josh Luckenbaugh

The Defense Department has launched a new task force to investigate the possibility of using generative artificial intelligence such as large language models for military missions.

Established Aug. 10 by Deputy Secretary of Defense Kathleen Hicks through the Defense Department’s Chief Digital and Artificial Intelligence Office, Task Force Lima “will develop, evaluate, recommend and monitor the implementation of generative AI capabilities across the DoD to ensure the department is ready to adopt and appropriately protect against generative AI technologies,” a fact sheet stated.

The Defense Department has been researching and developing AI systems “for decades,” Navy Capt. Manuel Xavier Lugo, the commander of Task Force Lima, said in an interview. “However, generative AI,” as seen in publicly available large language models such as ChatGPT and Bard, was introduced “in a very quick fashion to the world … and it is a little bit different from traditional artificial intelligence.” The task force was created to help focus the Defense Department’s generative AI research and “shepherd all the individual investigations that are going on with this technology across the whole department,” he added.

Task Force Lima has received more than 160 use cases for generative AI from across the department, most of which center on administrative problem sets such as “information summarization, logistics, planning roles — those types of problems,” Lugo said. “In some of the cases, you have blank sheets, and it’s a lot easier to work from a first draft that is generated by a tool in seconds. … The other utility is also being able to point the technology to volumes and volumes of data and summarize that into paragraphs or just pages, stuff that would take analysts or specialists a lot of time to do. We can do that in seconds, and we can iterate through those responses pretty quickly.”

One potential drawback of generative AI is “hallucinations,” where the large language models essentially make something up. While hallucinations are not an impediment in every use case, “when we start using [generative AI] in more precise [types] of workflows — like, for example, planning — we do need to have a certain level of trust with the output of the technology,” Lugo said.

The task force is challenging industry to help the department with procedures, technologies or system-of-systems concepts to reduce hallucinations and is building out guidance and tools such as “test and evaluation harnesses that can be plugged into the workflows and into the models themselves to give us output as to the quality of the metrics and … the quality of the output of the particular model,” he said.

Task Force Lima, which has more than 400 personnel according to Lugo, was chartered for 18 months and has a number of deliverables, he said. The task force has written interim guidance, which is currently undergoing staffing, “to provide the DoD with left and right limits as to how to use this technology,” he said.

The task force is also analyzing generative AI use cases “in order for us to group them and in order for us to find the actual areas of employment of the technology so that we can go ahead and then start writing specific frameworks and guardrails for those particular areas of employment,” he said.

Once models are mature enough to use, the task force will collect data to evaluate the performance of the models and “come up with frameworks as to how to measure that performance so that it can be applied globally to the models and the workflows and the human-machine teaming,” he said.

Other lines of effort include investigating how the models present security risks, as well as how to ensure responsible AI across the technology, he added.

Task Force Lima is “on track” to accomplish all of its deliverables in the 18-month time frame. However, “that doesn’t mean that all the work is going to be completed at the [end of the] 18 months,” Lugo said.

Part of the task force’s charter is to determine whom to transition some of the work to and to “come up with what is the DoD’s strategy for implementation of generative AI across the department,” he said.

A key factor in how much the department ultimately adopts generative AI will be cost, Lugo said.

“This technology is extremely expensive, and there are a lot of assumptions that we can just plug and play into our systems with it” because models like ChatGPT and Bard are available online for free, he said. “What is lost in all that is … the investments in compute to do these inferences. If we want to own our own model, there’s a lot of expense associated with that, and there’s a lot of expense in expertise that is associated with this.

“So, I think that’s going to be the largest challenge for the department … what is the actual return on investment that we’re going to get for this technology?” he said.

“Right now, everybody is in that hype of, ‘Wow, this is super cool.’ And then they’re going to get sticker shock, and that sticker shock is what’s probably going to level the technology to a more practical location.” ND

Topics: Electronics, Robotics and Autonomous Systems
