Gilmore: Major Weapon Acquisitions Can't Be Fixed Overnight
At a time when two-thirds of the Pentagon's major weapon programs are behind schedule and over budget, the release of J. Michael Gilmore's annual report to Congress can be as welcome as a skunk at a lawn party.
Gilmore's response: Don't shoot the messenger.
As the Defense Department's director of operational test and evaluation, Gilmore is required by law to provide an independent assessment of the performance of major weapon systems. His findings might be bad news for some programs, but as he points out, the first step in correcting a problem is identifying its causes.
"My office has to make certain that DoD leadership, Congress and military users understand what major weapon systems can and cannot do, what the problems are, the operational implications of those problems, and prioritize resources to fix those problems," Gilmore tells National Defense in an interview.
The Pentagon has come under renewed political pressure to shake up its acquisition process and lower the cost of weapon systems, which heightens the importance of testing, he says. "Defense Department systems are complex. It should come as no surprise to anyone that it can take a long time to get them to work."
Many successful programs experienced false starts and problems in operational tests along the way. Whereas earlier "developmental tests" are done in labs and controlled environments, operational tests and evaluations are realistic live-fire drills that are mandated by law and must be performed before any weapon system goes into full-rate production. Gilmore's next annual report, due in January, will include a litany of programs that did not perform as expected. "Does that mean programs are failing? No," he says. "History clearly supports that."
Even programs that live in perpetual procurement purgatory like the F-35 joint strike fighter eventually break free. "The F-22 fighter took two decades to field. We are still working on upgrades," Gilmore says. "JSF will be around for 30 to 40 years. We'll continue to work on it, and there will be many problems discovered. It should be no surprise."
Congress created the office of the director of operational test and evaluation in 1983. The director is appointed by the president and confirmed by the Senate. DOT&E currently employs 80 government civilians and 20 military officers.
Testers historically have had a tense relationship with the acquisition bureaucracy. Before DOT&E existed, program offices had more direct control of test reports. Some acquisition offices and contractors view DOT&E as a nemesis whose reports make executives run around with their hair on fire.
Gilmore insists that his job is not to kill programs, but to inform decision makers. "The purpose of my office is to highlight problems in a straightforward way," he says. "People can decide how important they are and how to fix them."
Major programs are rarely canceled simply because DOT&E declared them ineffective, Gilmore says, although "sometimes that happens." If the problems are too severe, the Pentagon could decide to terminate a program. "I don't make those decisions," he says. "We don't engage in rationalization of the problems. We don't try to rationalize their significance."
Gilmore says his office gets unfairly blamed for things it does not control. As Congress prepares to once again consider proposals to reform the Defense Department's acquisition process, Pentagon officials have suggested that changes might be needed in weapon testing and evaluations.
Undersecretary of Defense for Acquisition, Technology and Logistics Frank Kendall wants tests to be conducted earlier in the development cycle. In his view, operational tests identify problems so late in the process that they become cost prohibitive to fix. Earlier tests, Kendall says, could help the Pentagon catch problems before the military sinks huge amounts of money into a program. This would help avert expensive redesigns and modifications — a costly lesson the Pentagon learned over the past decade from the F-35 fighter and other programs.
Gilmore says he supports Kendall's initiative. "It's common sense." But he cautions against taking it too far. Programs go through developmental testing in their early stages; operational tests require a fully assembled prototype that can function in combat-like conditions.
If the Defense Department wants operational tests to occur earlier in the schedule, it will need to have "production representative" systems before the low-rate production milestone, says Gilmore.
Typically that does not happen. He suggests that, in advance of operational tests, program managers conduct unofficial evaluations known as "operational assessments" that can give them an early sense of what might happen in OT. There is no requirement in the law to do operational assessments, he says, but nothing in the law precludes them.
Gilmore warns that moving up test schedules alone will not accomplish much if earlier developmental tests are not thorough enough. "Developmental testing is one of the first places that suffers when programs run into schedule and cost problems," he says. "That shows up when we get to operational testing."
Gilmore's website is full of examples of programs that were technologically immature and, as a result, had many problems discovered for the first time in operational testing. "That is very late in the process," he says. "The issue is that sometimes there is inadequate developmental testing."
Any discussions about changes in test regimens stir suspicions that the Pentagon will cut test budgets in the name of efficiency. Gilmore has resisted suggestions that the cost of tests causes programs to run over budget. His office in August posted a presentation called "Reasons Behind Program Delays: 2014 Update" that seeks to discredit the accusations.
Infighting between program managers and testers is par for the course at the Defense Department. Kendall's predecessor Ashton Carter commissioned an independent team in 2011 to probe complaints that developmental and operational testing led to cost and schedule slippages in programs. The investigation failed to prove that tests were to blame.
In a speech at a recent industry conference, Gilmore reinforced that point. "How are you going to compress testing in this era of constrained budgets? I think it's a mistake," he tells the conference. "It accepts the premise that testing is driving increased cost. The facts don't support that premise."
Many of the Defense Department's current procurement woes are the result of decisions that were made long before the equipment was tested, Gilmore says during the interview.
One example is the Army's multibillion-dollar mobile communications system, the Warfighter Information Network-Tactical, or WIN-T. The system is about to go through its third operational test, and the outcome will determine whether it can transition to full-rate production. In earlier tests, WIN-T got bad reviews from users for being too complex, unreliable and cumbersome for combat operations.
When soldiers tell testers the system is not suitable, that is a deal breaker for any program, Gilmore says. "We don't test systems to exquisite golden standards. It doesn't have to be perfect," he says. "But soldiers are smart. They can work around some problems. But others, like the great complexity of the WIN-T soldier network extension and problems with its reliability, they can't deal with."
After last year's tests, the Army was wise to make modifications to WIN-T and schedule a new round of operational tests, he says. Too often, he adds, the military services rush programs to failure.
"You should not be schedule driven, you should be event driven, and think hard about whether you're actually ready for the test," he says. "Program managers are always under horrible schedule pressure, because schedule delays mean additional costs. The longer it takes to fix a problem, the longer the engineering pool has to be funded." There is also political pressure from contractors and their congressional backers to move systems into full-rate production in districts where hundreds of jobs might be at stake.
Gilmore also has recommended that the Pentagon revisit how system requirements are defined. That alone can set up a program for success or failure, he says. Usually requirements are written as technical specifications, but that is insufficient to ensure a system is militarily useful. Gilmore has repeatedly held up the Navy's P-8 maritime surveillance and antisubmarine airplane as an example of how not to define requirements.
Under the Pentagon's procurement regulations, officials from the Joint Staff's joint requirements oversight council, or JROC, must sign off on a system's most important requirements, dubbed "key performance parameters." In the case of the P-8, none of the KPPs specified that the aircraft needed to detect and destroy submarines, he says. In operational tests last year, the aircraft showed it could fly, but it was not able to perform wide-area antisubmarine surveillance. In a test that is supposed to replicate combat conditions, says Gilmore, the aircraft needs to do much more than just fly.
"I'm an advisor to the JROC," says Gilmore. "I do make them aware of my concerns. But it's up to the JROC to set requirements."
Perhaps the most dramatic illustration of what happens when a major weapon system's requirements, procurement strategy and test plans are out of kilter is the F-35. The aircraft's mission systems have yet to be tested in the F-35, even though the program is already in production. Gilmore expects the program will move forward, albeit at a slower pace than many had hoped.
Almost every setback in the F-35 can be pinned on decisions that were made more than a decade ago, long before the current program leaders took over.
In the early days of the George W. Bush administration, the Pentagon agreed to proceed to low-rate production at the beginning of engineering development, with little to no testing. Normally, low-rate production starts after development is completed.
"The assumption was that models and simulations were so good that very limited testing would be needed either in flight sciences or mission systems in order for the plane to mature," Gilmore says.
"Those were bad assumptions. It took the department a number of years to realize that." A program restructuring in 2010 added more time and money for developmental testing.
F-35 program executive officer Lt. Gen. Christopher Bogdan has put pressure on the contractors to improve the reliability of the aircraft. Poor reliability, says Gilmore, is a direct consequence of the decision to rush the program. "It is not a surprise that the aircraft availability rates are between 30 and 40 percent in the squadrons that have production aircraft," he says. "I expect that to improve over time." These are issues that should have been worked on before the aircraft went into production, with more component-level testing. It should not shock anyone, he adds, that as a consequence of the decision to start building airplanes before key components were fully tested, the aircraft remained immature.
The F-35 program office now has to play catch-up, and live with the consequences of those early decisions, he says. "You can't test reliability at the end of the program."
Gilmore is working closely with F-35 managers in preparation for operational tests in 2018. His office had recommended the aircraft undergo an "operational utility evaluation" in 2015, once software development was completed, but Gilmore later concluded that the mission systems would not be ready. "We continue to work on plans to do formal operational testing, which probably won't occur until 2018," he adds. "We're beginning preliminary work laying out some of the details of the operational test, but we're still several years away."