16 November 2019

EXCLUSIVE Pentagon’s AI Problem Is ‘Dirty’ Data: Lt. Gen. Shanahan

By SYDNEY J. FREEDBERG JR.

The military has all the data it needs to train machine learning algorithms for war – somewhere. Now the Joint AI Center has to find it all and clean it up. The goal: AI Ready data.

CRYSTAL CITY: “Some people say data is the new oil. I don’t like that,” the Defense Department’s AI director told me in his office here. “I treat it as mineral ore: There’s a lot of crap. You have to filter out the impurities from the raw material to get the gold nuggets.”

Lt. Gen. Jack Shanahan learned this the hard way as head of the much-debated Project Maven, which he led for two years before becoming the founding director of the Joint Artificial Intelligence Center last year. The lessons from that often-painful process – discussed in detail below – now shape Shanahan’s approach to the new and ever-more ambitious projects the Defense Department is taking on. They range from the relatively low-risk, non-combat applications that JAIC got warmed up with in 2019, like predicting helicopter engine breakdowns before they happen, to the joint warfighting efforts Shanahan wants to ramp up to in 2020:


Joint All-Domain Command & Control: This is a pilot project working towards what’s also called Multi-Domain C2, a vision of plugging all services, across all five domains — land, sea, air, space, and cyberspace — into a single seamless network. It’s a tremendous task to connect all the different and often incompatible technologies, organizations, and cultures.

Autonomous Ground Reconnaissance & Surveillance: This involves adding Maven-style analysis algorithms to more kinds of scout drones and even ground robots, so the software can call humans’ attention to potential threats and targets without someone having to watch every frame of video.

Operations Center Cognitive Assistant: This project aims to streamline the flow of information through the force. It will start with using natural-language processing to sort through radio chatter, turning troops’ urgent verbal calls for airstrikes and artillery support into target data in seconds instead of minutes.

Sensor To Shooter: This will build on Maven to develop algorithms that can shrink the time to locate potential targets, prioritize them, and present them to a human, who will decide what action to take. In keeping with Pentagon policy, Shanahan assured me, “this is about making humans faster, more efficient, and more effective. Humans are still going to have to make the big decisions about weapons employment.”

Dynamic & Deliberate Targeting: The idea here is to take a target (for example, one found by the Sensor To Shooter software) and figure out which aircraft is best positioned to strike it with which weapons along which flight path – much like how Uber matches you with a driver and route.
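The targeting problem described above is, at its core, a classic assignment problem: pair each target with the best-positioned shooter at minimum total cost (time to target, fuel, risk). As a minimal sketch, with entirely invented cost numbers, it could look like this:

```python
from itertools import permutations

# Hypothetical cost matrix: cost[i][j] = minutes for aircraft i to
# reach and engage target j (all values invented for illustration).
cost = [
    [12, 30, 25],   # aircraft A
    [28, 10, 18],   # aircraft B
    [35, 22,  8],   # aircraft C
]

def best_assignment(cost):
    """Exhaustively pair each aircraft with one target, minimizing total cost.

    Brute force is fine for a handful of aircraft; real mission planners
    would use the Hungarian algorithm or an optimization solver at scale.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return list(best), sum(cost[i][best[i]] for i in range(n))

pairing, total = best_assignment(cost)
print(pairing, total)  # pairing[i] = which target aircraft i strikes
```

The Uber analogy holds exactly here: the platform solves a matching problem under a cost function, and the human decision-maker still approves the result.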

“The data’s there in all the cases I described, but what’s the quality? Who’s the owner of the data?” Shanahan said. “There’s a lot of proprietary data that exists in weapons systems” – from maintenance diagnostics to targeting data – “and unlocking that becomes harder than anybody expected. Sometimes the best data is treated as engine exhaust rather than potential raw materials for algorithms.

“What has stymied most of the services when they dive into AI is data,” he said. “They realize how hard it is to get the right data to the right place, get it cleaned up, and train algorithms on it.”

Today’s military has vast amounts of data, Shanahan said, but “I can’t think of anything that is really truly AI-ready. In legacy systems we’re essentially playing the data as it lies, which gets complicated, because it’s messy, it’s dirty. You have certain challenges of data quality, data provenance, and data fidelity, and every one of those throws a curve ball.”

While the Pentagon needs solid data for lots of different purposes, not just AI, large amounts of good data are especially essential for machine learning. Fighting wars is only going to get more complex in the future: Military leaders see huge opportunities to use AI to comb through that complexity to make operations more efficient, reduce collateral damage, and bring the troops home safely.

Lessons From Maven: Show Me The Camel

Project Maven showed Shanahan just how hard the data wrangling could get. The aim of Maven was to analyze huge amounts of drone surveillance video that human analysts couldn’t keep up with, training machine-learning algorithms to recognize hints of terrorist activity and report it.

“We thought it would be easier than it was, because we had tens of thousands of hours of full motion video from real missions,” Shanahan told me. “But it was on tapes somewhere that someone had stored, and a lot of the video gets stored for a certain amount of time and then gets dumped. We had to physically go out and pick tapes up.”

While the military data was patchy and dirty, open-source image libraries and other civilian sources were too clean to teach an algorithm how to understand a war zone, Shanahan told me. “If you train against a very clean, gold-standard data set, it will not work in real world conditions,” he said. “It’s much more challenging — smoke, haze, fog, clouds — fill in the blank.

“Then you have the edge cases, something that is so unusual that you just didn’t have enough data to train against it,” Shanahan said. “For example, we may not have had enough camel imagery.” That sounds comical – until the first few hundred times your algorithm glitches because it can’t figure out what this strange lumpy object is that it’s seeing from 10,000 feet overhead.
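The two failure modes Shanahan describes above have standard mitigations in machine-learning practice: synthetically degrading clean training imagery (haze, noise) so a model generalizes to messy sensor conditions, and oversampling rare edge-case classes such as camels so they are not drowned out during training. A toy NumPy sketch, with all parameters invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def degrade(img, haze=0.3, noise_sigma=0.05):
    """Simulate poor sensor conditions on a clean [0,1] grayscale image:
    blend toward a uniform haze value, then add Gaussian sensor noise."""
    hazy = (1 - haze) * img + haze * 0.8                 # wash out contrast
    noisy = hazy + rng.normal(0, noise_sigma, img.shape)  # sensor noise
    return np.clip(noisy, 0.0, 1.0)

def oversample(images, labels, rare_label, factor=5):
    """Duplicate rare-class examples (e.g. 'camel') so the class appears
    often enough for the model to learn it."""
    extra = [(im, lb) for im, lb in zip(images, labels) if lb == rare_label]
    images = list(images) + [im for im, _ in extra] * (factor - 1)
    labels = list(labels) + [lb for _, lb in extra] * (factor - 1)
    return images, labels

clean = rng.random((8, 8))        # stand-in for one clean training image
print(degrade(clean).shape)       # (8, 8)
```

This is only a sketch of the idea; production pipelines use far richer augmentation (simulated smoke, motion blur, sensor-specific artifacts) and weighted sampling rather than literal duplication.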

Even once you had the data in usable form, Shanahan continued, you needed humans to categorize “tens of thousands, if not millions, of images” so the algorithm could learn, for example, what camels look like as opposed to pickup trucks, people, buildings and weapons. Supervised machine learning algorithms typically need to see millions of clearly labeled examples before they can figure out how to deal with new, unlabeled data. So it takes a huge amount of human labor, doing tasks that require little intelligence, to get the data in a form the machine can actually learn from.
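Concretely, “categorizing images” for a detection model usually means annotators producing per-frame label records, each object of interest tagged with a class and a bounding box. The schema below is invented for illustration, not Maven’s actual format:

```python
import json

# Hypothetical label record a human annotator produces for one video
# frame: every object of interest gets a class name and a bounding box.
record = {
    "frame_id": "mission042_frame_001337",
    "labels": [
        {"cls": "pickup_truck", "bbox": [412, 210, 470, 248]},  # x0,y0,x1,y1
        {"cls": "camel",        "bbox": [120, 305, 160, 342]},
    ],
}

def class_counts(records):
    """Tally labeled examples per class -- the first sanity check before
    training, since under-represented classes (camels!) train poorly."""
    counts = {}
    for rec in records:
        for lab in rec["labels"]:
            counts[lab["cls"]] = counts.get(lab["cls"], 0) + 1
    return counts

print(json.dumps(class_counts([record])))  # {"pickup_truck": 1, "camel": 1}
```

Multiply that one record by millions of frames and it becomes clear why labeling manpower, not algorithms, was the bottleneck.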

On Maven, intelligence community analysts did much of the data labeling. The Intelligence Systems Support Office down in Tampa, near Special Operations Command’s SOFWERX, even spun off a dedicated subunit just to support Shanahan. (This Algorithmic Warfare Provisional Program Activity Office now helps JAIC as well.)

Even so, manpower was a problem. “We never got the numbers we needed, so we had to get contractor support,” Shanahan said. Unlike a commercial company outsourcing data-labeling to, say, China, the Defense Department had sensitive operational information that could only be worked on by US nationals with security clearances. And before handing the video to the cleared contractors, Shanahan said, “you had to get rid of some sensitive things and some extreme potentially graphic things you didn’t necessarily want data labelers to look at.”

All told, it was a huge amount of work – and it’s never really done. “When you fly it for the first time, the algorithm is going to find things you didn’t train it on,” Shanahan said. “They’re constantly updated through what we call dynamic retraining.”
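The “dynamic retraining” loop Shanahan describes can be sketched in a few lines: field the model, route the detections it was unsure about to human labelers, and fold those labels back into training. Everything here (the confidence threshold, the stand-in functions) is invented to show the control flow only:

```python
def dynamic_retraining_cycle(model, new_frames, label_fn, retrain_fn,
                             low_conf=0.6):
    """One pass of a field-feedback loop: run the fielded model, send
    low-confidence detections to human labelers, retrain on the result."""
    hard_cases = []
    for frame in new_frames:
        cls, conf = model(frame)
        if conf < low_conf:                              # model was unsure:
            hard_cases.append((frame, label_fn(frame)))  # likely edge case
    if hard_cases:
        model = retrain_fn(model, hard_cases)  # fold new labels into training
    return model, len(hard_cases)

# Tiny stand-ins to exercise the loop:
model = lambda f: ("truck", 0.9 if f % 2 else 0.3)   # even frames are "hard"
label_fn = lambda f: "camel"                          # human ground truth
retrain_fn = lambda m, cases: m                       # pretend to retrain
_, n_hard = dynamic_retraining_cycle(model, range(6), label_fn, retrain_fn)
print(n_hard)  # 3  (frames 0, 2, 4 fell below the confidence threshold)
```

The six-month staleness problem Shanahan mentions next is what happens when this loop stalls: the fielded model keeps scoring confidently on a world that has moved on.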

Even civilian algorithms require continual tweaking, because the world keeps changing. And many military algorithms have to deal with an adversary who’s actively trying to deceive them. The cycle of countermeasure and counter-countermeasure is as old as warfare, but the rise of machine learning has spawned a whole science of adversarial AI to deceive the algorithms.

“We learned in Maven, even if you fielded a decent algorithm, if you don’t update that algorithm in the next six months, [users] become cynical and disillusioned that it’s not keeping up with what they’re seeing in the real world,” Shanahan told me. Today, after much streamlining of processes, Maven is updated regularly at a pace unobtainable even a year ago, Shanahan said, but it’s still far short of the almost-daily updates achievable in civilian software.

SOURCE: Army Multi-Domain Operations Concept, December 2018.

Beyond Maven: AI For Joint Warfighting

Maven solved its problems – mostly. The head of Air Combat Command has publicly said he doesn’t entirely trust its analysis, not yet, and Shanahan himself admitted its accuracy was initially about 50-50. But the entire basis for Maven was to deliver initial capabilities – a minimum viable product – as quickly as possible to the field, then get real-world feedback, improve it, field the upgrade, and repeat.

But the tools for tackling full motion video don’t necessarily translate to other tasks that the new Joint AI Center is taking on.

Even when JAIC is seeking to apply Maven-style video analysis to other kinds of surveillance footage, the algorithms need to be retrained to recognize different targets in different landscapes and weather conditions, all seen from different angles and altitudes through different kinds of cameras. “You can’t just train an algorithm on electro-optical data and expect it to perform in infrared,” Shanahan said. “We tried that.”

And many of JAIC’s projects don’t involve video at all: They range from predicting helicopter engine breakdowns to using natural-language processing to turn troops’ radio calls for air support into unambiguous targeting data.

This is another reason why Shanahan prefers to think of data as mineral ore rather than petroleum, he told me: “It’s not fungible like oil is.” You can think of full motion video, for example, as palladium: an essential catalyst for some applications, irrelevant for others. And like rare minerals, all the different kinds of data are out there – somewhere – if you can find them, get permission to exploit them from whoever currently owns them, and separate them from the junk that they’re embedded in.

There’s no silver-bullet solution, Shanahan said. Some suggest rigorously imposing a top-down standard for formatting and handling data, but he argues the Defense Department has too many standards already, and that they are inconsistently applied.

“There are a lot of people who want to just jump to data standards. I don’t,” he told me. “Every weapons system that we have, and every piece of data that we have, conforms to some standard. There are over a thousand different standards related to data today. They’re just not all enforced.”

“It’s less a question of standards and more of policies and governance,” he told me. “We now have to think about data as a strategic asset in its own right. Now, a much better approach to drive interoperability is to start with a discussion of metadata standards that are as lightweight as possible, as well as a Modular Open Systems Architecture. Or put another way, we need to agree on the definition of ‘AI Ready’ when it comes to our weapon systems.”

That includes getting acquisition program managers, traditionally focused on the physical performance of the weapons they are developing, fielding, and sustaining, to consider data as “part of the life-cycle management process just as much as the hardware is,” Shanahan said. “I see signs of the services beginning to have that conversation about future weapons systems.”

The fundamental issue: “The Department of Defense is different from Amazon, Google, Microsoft, which were born as digital companies,” he said. “The Department of Defense was not. It started as a hardware company. It’s an industrial age environment and we’re trying to make this transformation to an information-age, software-driven environment.”

One of JAIC’s key contributions here will be to build a “common foundation” that pulls together usable data and proven algorithms from across the Defense Department for any DoD user to access and apply to their specific needs. (This will require a DoD-wide cloud computing system, he noted).

“We want to have APIs [Application Programming Interfaces] that allow anyone to come in and access our common foundation or platform. We will publish API definitions, what you need to write to,” Shanahan said. But the sheer diversity of the data and the different purposes it can be put to, he said, means that “there is never going to be a single standard API.”

Likewise, he said, while there will be “minimum common denominator” standards for tagging metadata with various categories and labels, “you will have lots of flexibility for mission-specific tagging.”
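The tagging scheme he describes, a small mandatory core plus free-form mission-specific tags, can be sketched as a simple validator. The core field names below are invented for illustration, not any actual DoD schema:

```python
# Hypothetical "minimum common denominator" metadata core: every dataset
# must carry these fields; anything else rides along as mission tags.
REQUIRED_CORE = {"source_system", "collection_time", "classification", "format"}

def validate_metadata(meta):
    """Accept metadata that has all core fields; extra mission-specific
    tags are allowed and passed through untouched."""
    missing = REQUIRED_CORE - meta.keys()
    if missing:
        raise ValueError(f"missing core metadata fields: {sorted(missing)}")
    core = {k: meta[k] for k in REQUIRED_CORE}
    mission_tags = {k: v for k, v in meta.items() if k not in REQUIRED_CORE}
    return core, mission_tags

core, tags = validate_metadata({
    "source_system": "MQ-9_FMV",
    "collection_time": "2019-11-16T08:00:00Z",
    "classification": "UNCLASS",
    "format": "h264",
    "target_deck": "exercise-7",   # mission-specific, not in the core
})
print(sorted(tags))  # ['target_deck']
```

Keeping the required core small is what makes the standard “lightweight”: interoperability comes from the shared fields, while each mission keeps room to tag what only it cares about.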

It’s a tremendous task, but one with equally tremendous potential benefits. Working with Chief Data Officer Michael Conlin, “we are trying to fix all sorts of problems with data across the Department of Defense,” not just for AI, Shanahan told me. “I am optimistic.”

“AI will likely become the driving force of change in how the department treats data,” Shanahan told me. “And technology is changing so fast that the painful data wrangling processes we endure today may well be transformed into something entirely more user-friendly a year from now.”
