Navy Developing Simulations to Test Autonomous Vessels
The Navy’s increasing mission reliance on autonomy has created a unique and complex challenge: how do you test it?
Testing autonomy is a bit like the Wild West — expansive and lawless. There are no hard rules and “a nearly infinite set of situations,” Yiannis Papelis, chief technology officer at Old Dominion University’s Virginia Modeling, Analysis and Simulation Center and senior research director for autonomous systems at the Virginia Institute for Spaceflight and Autonomy, said in an interview.
The Navy, along with a host of partners, is working to answer that question through a project called the Naval Autonomy Test System. Its goal is to create a simulation framework that can not only work through the nearly infinite set of situations, but also produce results that can be trusted.
Autonomy is “one word, but a lot of times it means different things,” Papelis said. In terms of ships, it can deal with anything from maneuvering around obstacles to firing its own weapons and even collaborating with other vessels, he said.
When there is no crew on the vessel, and it has to have autonomous behavior, there is added complexity to testing performance, stability, speed and other characteristics, he said.
“You have to test this and ensure that this autonomy is capable of performing the required mission,” Papelis said. “And it turns out that this is significantly harder to test than it is to test the physical characteristics of a boat.”
Vessels bring quantifiable metrics that can be tested, such as verifying maintainable speed under certain conditions, he said. With autonomy, this is “much, much harder. And the reason is because there’s almost an infinite set of situations you can find it in.”
What all cases of autonomy have in common is a dependence on how the vessel behaves in relation to things around it — both stationary and dynamic, he said. In the human world, vessels are governed by a set of rules called “collision regulations.”
For example, a vessel approaching from the left yields to one coming from the right, Papelis explained.
“And so the whole point is how can we test that this autonomy will now provide behaviors that are acceptable and consistent with [collision regulations], and they are also safe, and they also meet the mission requirements,” he said.
In a word: sampling. Lots and lots of sampling.
There are “all kinds of techniques” for testing, and one is “simply trying,” he said. Running a scenario 300 times doesn’t guarantee the outcome of test 301, but “I’m a heck of a lot more comfortable thinking that it will succeed,” he said, as compared to “running one test, because that’s all they can afford if it’s a real boat.”
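One rough way to put a number on that comfort, offered as a sketch and not as anything the project is described as using: if a scenario passes n independent runs with no failures, the largest per-run failure probability still consistent with that record at a chosen confidence level is 1 - (1 - confidence)^(1/n). The function name below is hypothetical, and the assumption that runs are independent with a fixed failure probability is a strong one for real simulations.

```python
def failure_rate_upper_bound(n_runs: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-run failure probability after n_runs
    consecutive passes, ASSUMING independent runs with one fixed
    failure probability (an illustrative assumption)."""
    if n_runs <= 0:
        raise ValueError("n_runs must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Zero failures in n runs has probability (1 - p)^n.
    # Solve (1 - p)^n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_runs)


if __name__ == "__main__":
    for n in (1, 300):
        bound = failure_rate_upper_bound(n)
        print(f"{n} clean run(s): failure rate below {bound:.2%} at 95% confidence")
```

Under those assumptions, 300 clean runs bound the failure rate at about 1 percent with 95 percent confidence, while a single clean run bounds almost nothing.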
Current testing practices can be both expensive and dangerous, with limits on how often and how effectively scenarios can be carried out. Ships can’t limitlessly crash into obstacles or each other to test collision regulations. Not physically, anyway. Inside a simulation is another story.
Eric Giesberg, naval architect at the Naval Surface Warfare Center Carderock Division and program manager for the Naval Autonomy Test System, estimated as many as 500 ship collisions occur a year, and avoiding them is one of the project’s main use cases.
“Just trying to get that right — it’s going to require a lot of data, a lot of testing,” he said. “And simulation is the safest place to do a lot of this,” he said, adding that it’s arguably the least expensive way to generate autonomous data.
Developing the simulation framework has been in the works since 2020 and combines the efforts of two Naval Surface Warfare Centers and both the Virginia Modeling, Analysis and Simulation Center and the Virginia Institute for Spaceflight and Autonomy, among others. The program is funded under the Central Test and Evaluation Investment Program, which is part of the Test Resource Management Center’s portfolio.
Old Dominion’s role in the project has involved multiple, smaller projects under its umbrella, the latest being a $2 million, two-year project award in February from the Naval Surface Warfare Center at Dahlgren. That project’s objective is to develop “flexible simulation capabilities for testing U.S. Navy maritime autonomy solutions under a wide range of potential scenarios,” an Old Dominion press release stated.
A generous portion of Papelis’ work within the project deals with creating maritime traffic scenarios that challenge collision regulations.
“We can have it pretend to be giving way, but then [at] the last moment we were going to actually turn into the autonomy and basically violate the rules,” he said. “A lot of what we’re doing is building … traffic whose sole purpose in life is to make the autonomy’s life difficult. Because that’s how you want to test autonomy.”
Ironically, breaking the rules is also what builds trust. The more data, the more certainty.
If building trust in autonomy is one of the project’s biggest questions, the answer is “the holy grail,” Papelis said.
Complete trust is impossible, he said. The goal is to get as far as you can. If testing capabilities can take trust from 20 percent to 80 or 90 percent, that’s a win.
And Giesberg said simulations allow for multiple tests without having to build a ship. “If I want to run two tests in parallel, I can use a high-performance computing system to run thousands of these simulations in parallel.”
But how much data is enough to achieve satisfactory confidence?
“Throw in two boats, and now you have 300 times 300 permutations,” Papelis said. Is 10 percent of those permutations a good enough sampling? “And this is the challenge: there is an unlimited set of conditions.”
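The arithmetic behind that question can be sketched in a few lines. Taking the quoted figure of 300 conditions per boat at face value, two boats give 90,000 ordered pairings, and a 10 percent sample is 9,000 of them. The function name, the use of integer indices to stand in for conditions, and the uniform random sampling are all assumptions made for illustration; the article does not say how the project selects scenarios.

```python
import itertools
import random

CONDITIONS_PER_BOAT = 300  # figure quoted in the article
SAMPLE_FRACTION = 0.10     # the "10 percent" raised in the article


def sample_scenario_pairs(n_conditions: int, fraction: float, seed: int = 0):
    """Return a uniform random sample, without replacement, of
    (boat_a_condition, boat_b_condition) index pairs."""
    # Every ordered pairing of one condition per boat: n * n entries.
    all_pairs = list(itertools.product(range(n_conditions), repeat=2))
    sample_size = round(len(all_pairs) * fraction)
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return all_pairs, rng.sample(all_pairs, sample_size)


if __name__ == "__main__":
    all_pairs, sampled = sample_scenario_pairs(CONDITIONS_PER_BOAT, SAMPLE_FRACTION)
    print(f"total pairings: {len(all_pairs)}, sampled: {len(sampled)}")
```

A uniform sample like this says nothing about which pairings matter most, which is where the border cases come in.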
When determining what testing data is useful, cases that clearly fail or succeed are far less interesting than what Papelis called border cases: uncertainty that creates scenarios compelling enough to put a real boat in the water “and seeing what would happen under that condition.”
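One simple way to hunt for such a border case, sketched here under stated assumptions, is bisection: start from one scenario setting that clearly fails and one that clearly passes, then repeatedly test the midpoint until the gap between pass and fail is narrow. Everything in the example is hypothetical, including the scenario parameter (the distance at which an intruder turns in), the toy pass/fail rule standing in for a real simulation run, and the assumption that the outcome flips only once between the two endpoints.

```python
from typing import Callable, Tuple


def find_border(run_passes: Callable[[float], bool],
                fail_value: float,
                pass_value: float,
                tolerance: float = 1.0) -> Tuple[float, float]:
    """Narrow the interval between a failing and a passing parameter value.

    ASSUMES the outcome flips exactly once between the endpoints.
    Returns (last_failing_value, first_passing_value).
    """
    if run_passes(fail_value) or not run_passes(pass_value):
        raise ValueError("need one failing endpoint and one passing endpoint")
    failing, passing = fail_value, pass_value
    while abs(passing - failing) > tolerance:
        midpoint = (failing + passing) / 2.0
        if run_passes(midpoint):
            passing = midpoint   # border lies between failing and midpoint
        else:
            failing = midpoint   # border lies between midpoint and passing
    return failing, passing


def toy_simulation(turn_in_distance_m: float) -> bool:
    """Stand-in for a simulation run: the hypothetical autonomy avoids a
    collision only if the intruder turns in at 137 m or farther away."""
    return turn_in_distance_m >= 137.0


if __name__ == "__main__":
    failing, passing = find_border(toy_simulation, 0.0, 1000.0)
    print(f"border lies between {failing:.2f} m and {passing:.2f} m")
```

The narrow interval that comes back is the kind of condition that might justify a test with a real boat, since settings far on either side are already clearly decided.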
While every condition in an unlimited set can never be fully accounted for, the limitations removed in a simulation environment can bring it closer than it has ever been before.
“So the goal of this project is to build a simulation framework, a simulation environment, in which you can plug in different autonomy models … and build scenarios that are … rich enough to increase your confidence that this autonomy will actually be safe to use and it will meet its mission requirements,” he added.
And it’s not just about trusting autonomy, but defining what trust in autonomy looks like, Giesberg said. “We test out sailors, we send them out, but there’s the autonomy. How do we build up that trust with them?” he said.
The test system will likely never truly be “finished,” Giesberg said. It will be “constantly evolving” along with autonomy. “And the moment we can’t start finding limitations or things like that, that’s maybe when the ship will end. I don’t see that happening anytime soon.”
But that doesn’t mean there aren’t measures of success along the way.
“One of our [key performance indicators] arguably should be how quickly can we validate and how quickly can we find issues with a vehicle,” he said.
One milestone Giesberg described as personally exciting was the recent release of the program’s closed beta, which allows users to go online, draw up a scenario, choose an environment “and see what will happen to the autonomy.”
Giesberg said they’ve encouraged experimentation within the beta.
“Let’s say somebody’s like, ‘Hey, I’m a sailor, I just want to experiment with this.’ That’ll be real quick for them to do,” he said. “It has also just been really exciting to start to see people use this system and mess around with it as we’re bringing it from closed beta to” initial operational capability.
Seeing tangible interaction and enthusiasm with the program’s work was an excitement shared by Papelis, ordinarily immersed in intangibles such as research and architecture. Or, as he described it, like fish and water: “It’s there. It’s needed for survival, but nobody notices.”
Once the work reached a point of end-to-end capability demonstration, he was able to “see the light bulb.”
“People came in, we built a scenario by using the tools we developed, we ran a simulation, and we showed what happens when different autonomies are challenged under different scenarios,” he said. “What was amazing is to see the light bulb.”
He witnessed program managers, developers and programmers, “down to the guys who are effectively tasked with running specific physical tests,” realize how the program could benefit their work, he said. “So, that was very exciting for me personally, to see such an interest for the project across the board.”
Papelis has been involved with the program since its inception, and Old Dominion’s two-year project likely won’t be the end. Currently about a year into its two-year timeline, the project’s conclusion will simply make way for the next one, he said, estimating that the official launch of the overall system is “several years out.”
The milestones will arguably never end in a program that doesn’t so much have an end goal as a pacing goal: evolve with the autonomy.
What that ultimately means is “architecting this in a way that we’re not limiting ourselves in the future,” Giesberg said. “We know we need to test these systems. And making sure we’re testing and being as flexible as we can and … it’s very important for the Navy, and I think for the world. We’re all learning as we’re going.”