U.S. Airmen review Afghan guests’ medical paperwork at the Philadelphia International Airport, October 22, 2021. U.S. Army / Spc. Ian Miller, 55th Signal Company

Inside the Data-Driven Operation that Moved Afghan Refugees from Dulles to Safe Havens

The main challenge was reconciling and processing disconnected, stove-piped, unavailable, or incorrect data.

When the phone rang for Brig. Gen. Mark Weber on Aug. 22, it wasn’t the order that the chief of staff of the Montana Air National Guard was expecting. Instead of taking charge of logistics movements as Hurricane Ida bore down on Louisiana, Weber was summoned to Washington, D.C., to lead a key part of the largest non-combatant evacuation operation in U.S. history. Kabul had fallen to the Taliban one week earlier, and the U.S. government had seen the trickle of refugees out of Afghanistan turn into a flood. Among other things, his experience would illuminate the need for better data sharing between government agencies.

The overall effort belonged to the Department of Homeland Security, but the Defense Department would be doing much of the heavy lifting. Using military aircraft and commercial airliners contracted through Air Mobility Command and U.S. Transportation Command, Operation Allies Welcome (né Operation Allies Refuge) would fly fleeing Afghans to designated lily pads—temporary receiving and vetting locations throughout the Middle East and Europe—from which they would then be transferred to longer-term safe havens throughout the United States. 

Gen. Weber’s task would be moving refugees from Dulles International Airport in Virginia to the eight U.S. military bases serving as safe havens. His team would direct the flow of aircraft and ground vehicles provided by the Air Force and Army components of U.S. Northern Command. 

He might have expected to fly to Tyndall Air Force Base, Florida, where the 601st Air Operations Center usually coordinates logistics and mobility for humanitarian assistance and disaster response in North America. Instead, he and his team swiftly established a mobility command-and-control facility in the conference room of a hotel near Dulles, where an aircraft full of Afghan refugees was already inbound. Weber and his small staff of service members would live and work 16-hour days (or more) for the next three weeks. The team would soon learn that their main challenge would be reconciling and processing disconnected, stove-piped, unavailable, or incorrect data.

It was critical that the number of incoming guests never exceed the capacity at Dulles or at the safe havens. 

His team needed to know the number of Afghan guests per flight, the capacity of each safe haven, the visa status of each guest, the schedule of inbound and outbound flights, and more. These data, however, were scattered across incompatible systems, such as the State Department’s flight tracker and refugee list, DOD’s dataset of flight status for mobility aircraft, Army North’s refugee tracker, Customs and Border Protection’s Arrival and Departure Information System, Northern Command passenger data, Flightradar24’s lists of participating commercial flights, National Oceanic and Atmospheric Administration weather data, and DHS’s dataset of refugee information. Some of these sources were classified; some were not. Weber’s team worked to compile the data manually, poring through rows of disparate spreadsheets, sticky notes, and disconnected databases to keep tabs on the number of incoming guests and the remaining capacity at each location.

The team had to gather data from 15 government and non-government agencies on four continents. The compilation was necessary for the timing and coordination of aircraft and ground transportation to ensure a smooth flow of guests, some still departing Afghanistan, some flowing through lily pads, and some already en route to Dulles. Getting it wrong could have cascading effects. An underestimation could overtax the available beds, food, and medical capacities at safe havens, which would be followed by a refugee surplus at Dulles that could bottleneck operations. Overestimating the number would force aircraft to leave partially empty to prevent ramp and processing congestion. 

Building the Plane as It Flies 

On the first day of operations, USTRANSCOM reported eight Civil Reserve Air Fleet aircraft (airliners activated for military transport during a crisis) inbound for Dulles with a total of 2,000 seats. But how many refugees would be in those seats? The team had no access to the exact passenger manifests. No common dataset existed among the Homeland Security, Defense, and State departments. The only way to learn just how many people were coming into Dulles was to spend hours making calls to lily pad locations, safe havens across the United States, and government agencies.

Once they had solid figures, Weber’s team would coordinate with the Army team that was organizing bus transport to safe havens. Army logisticians would call each safe haven daily and sift line-by-line through passenger lists to confirm how many people were on buses and aircraft at that moment, how many might be coming in the next 24 hours, and how many people each safe haven could accept over the next day. 

Dozens of data-gathering phone calls soon turned into hundreds of hour-by-hour calls to every possible staging location, government agency, and safe haven to create a comprehensive dataset, valid only for that snapshot in time. Compiled data, however, is only as good as the sources it is derived from. The team found that some locations, out of an abundance of caution, would report only the numbers approved at regular intervals, not the most accurate moment-by-moment numbers. One location, not wanting to report inaccurate numbers, released only the approved figures on its available capacity, which could be up to four hours out of date. The result was a stale dataset with the potential for compounding errors in the aggregate data.
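The snapshot problem described above can be illustrated with a minimal Python sketch. It is not the team's tooling or the software discussed later in this article. The location names, figures, timestamp, one-hour freshness threshold, and every identifier (`CapacityReport`, `latest_reports`, `capacity_snapshot`) are hypothetical, chosen only to show how keeping the most recent report per location and flagging old ones might surface the risk of acting on a four-hour-old capacity figure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CapacityReport:
    """One capacity figure from one source (all fields are illustrative)."""
    location: str          # hypothetical safe-haven name
    source: str            # how the figure was obtained
    beds_available: int
    reported_at: datetime


def latest_reports(reports):
    """Keep only the most recent report for each location."""
    latest = {}
    for report in reports:
        current = latest.get(report.location)
        if current is None or report.reported_at > current.reported_at:
            latest[report.location] = report
    return latest


def capacity_snapshot(reports, inbound, now, max_age=timedelta(hours=1)):
    """Compute headroom per location and flag figures older than max_age.

    The one-hour default for max_age is an assumption made for this sketch.
    """
    snapshot = {}
    for location, report in latest_reports(reports).items():
        snapshot[location] = {
            # Negative headroom means more guests inbound than beds reported.
            "headroom": report.beds_available - inbound.get(location, 0),
            "stale": (now - report.reported_at) > max_age,
            "source": report.source,
        }
    return snapshot


# Made-up example data, not figures from the operation.
now = datetime(2021, 8, 25, 12, 0)
reports = [
    CapacityReport("Haven A", "phone call", 400, now - timedelta(hours=4)),
    CapacityReport("Haven A", "tracker", 350, now - timedelta(hours=5)),
    CapacityReport("Haven B", "tracker", 200, now - timedelta(minutes=10)),
]
inbound = {"Haven A": 450, "Haven B": 120}

snapshot = capacity_snapshot(reports, inbound, now)
for location, row in sorted(snapshot.items()):
    print(location, row)
```

In this made-up case, the freshest figure for the first location is still four hours old, so it is flagged as stale as well as over capacity. That is the kind of signal that would prompt another phone call before a flight is released.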

A Data-Centric Approach 

After three days, Weber’s team and others had successfully processed thousands of guests without a shortfall in capacity. The agility of State, DHS, DoD, and non-governmental organizations was on full display. But ad hoc human-based data gathering was not a sustainable solution. It turned out that a solution existed: a growing team of Air Force, Army, and Joint Staff data experts was quietly combining the disparate refugee datasets into a cloud-based commercial software tool that could reveal the total picture of the operations.

A year earlier, as the DOD adapted to operations under COVID-19 restrictions, the Air Force had been struggling to keep tabs on the readiness of its force and the availability of aircraft for missions. This personnel and maintenance data existed in disconnected local systems or in paper documents that airmen spent thousands of hours each day compiling for briefing slides that were outdated the moment they were compiled. Learning of an Army effort to combine stove-piped soldier and equipment data in the cloud, the Air Force followed suit. The results were immediate: real-time access for commanders into the health and readiness of their force with increasing data sets coming into the platform each week. 

The Joint Staff took notice and began to integrate these data capabilities with those in the Army and with similar cloud-based efforts at USNORTHCOM. A coalition of data-centric warfighting capabilities was coming together under Deputy Defense Secretary Kathleen Hicks’ new data decrees for the Department and an announcement of a data accelerator program. One year later, as DoD was tasked to support arriving Afghans, the USNORTHCOM Commander told his team to help. Within 30 minutes of notification to the software company, data engineers were on a train from New York to D.C. After 36 hours of data mapping and integration work, the cloud-based system went live, providing constantly updated metrics on incoming guests and available capacity. While the software initially integrated data from only a few agencies, new data sources were added each day. The result was a growing repository of live, interagency, and open-source data that could be curated for the needs of any agency involved in the operation. Rather than relying on stale spreadsheet data gathered by humans every eight hours, the software increasingly automated the information gathering and drove the update interval down to three seconds, with data that soon became the authoritative source.

The cloud-based software was useful for more than accurate predictions about capacity requirements; it also proved vital for finding information when challenges arose. A week into the operation, when a deaf Afghan boy was inadvertently separated from his family during his many movements, Weber’s team was able to quickly pinpoint his location using the cloud-based data and reunite him with his family.

By Sept. 14, Gen. Weber and his team had moved 55,655 Afghan guests without ever exceeding capacity at the various transit points. Critical to this success was the rapid application of a common data integration platform.

Looking back, Gen. Weber called Allies Welcome the most “heartbreaking, honorable, joyful, and outrageously fulfilling” operation of his career. The military and government should learn from his experience. No big problem in the 21st century should be solved by daily human efforts to enter information into spreadsheets and briefing documents. Agencies should set up commercial cloud-based data tools before a crisis arrives. The government as a whole should work much harder to establish ways to share data through clouds.