Pentagon Researchers Test 'Worst-Case Scenario' Attack on US Power Grid
Over 100 people gathered off the tip of Long Island this month to roleplay a cyberattack that takes out the U.S. electric grid for weeks on end.
Plum Island, N.Y. – The team of grid operators had spent days restoring power when a digital strike took out one of two operational utility stations. The other utility was also under attack.
A month had passed since all power in the region was taken down by a devastating cyberattack. It had been a grueling six days restoring power across two electrical utilities and to the building deemed a critical national asset by the Secretary of Energy.
The cyber strike hadn’t forced the team back to zero, but it wasn’t far from it.
Just moments ago, the two electric utilities had been working in concert, delivering reliable and redundant power to the critical asset. Now one utility was down for the count and the other was under attack.
The grid operators’ only chance to restore power to the asset would be to route it, substation by substation, from the utility that was still operating. The team of cybersecurity researchers assisting the grid operators would have to use every piece of technology and know-how they had to ensure that utility stayed powered up, trustworthy and malware-free.
The Defense Advanced Research Projects Agency exercise, which took place from Nov. 1 to Nov. 7, was fictional, but it was designed to mimic all the hurdles and uncertainty of a real-world cyberattack that took out power across the nation for weeks on end, forcing a recovery known as a “black start.”
To add realism, the exercise took place on Plum Island, the site of a federal research facility off the North Fork of Long Island, where DARPA researchers were able to segregate a portion of the island on its own electric grid.
Over the course of the seven-day exercise, more than 100 people gathered on the island, filling every necessary role to mimic an actual black start.
At the center of the exercise was a team of grid operators from electric utilities across the nation, which was in charge of restoring and sustaining power.
At its most basic level, their job involved creating initial power transmissions at both utilities using a diesel generator, then building cyber-secure “crank paths” through a series of electric substations that would increase the transmissions’ voltage until they were capable of powering the two utilities and delivering redundant power to the exercise’s critical asset.
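As a rough illustration of the crank-path idea, the sketch below treats the grid as a graph of substations and searches for a route from the generator to the critical asset that passes only through substations believed to be clean. The substation names, the topology and the `trusted` set are all hypothetical, and the breadth-first search is just one simple way to frame the problem; the exercise's operators may have planned their paths quite differently.

```python
from collections import deque


def find_crank_path(links, source, target, trusted):
    """Return the shortest path from source to target that passes only
    through trusted substations, or None if no such path exists."""
    if source not in trusted or target not in trusted:
        return None
    previous = {source: None}  # maps each visited node to the node before it
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back from the target to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, ()):
            if neighbor in trusted and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical topology: every name here is made up for illustration.
links = {
    "generator": ["sub1", "sub2"],
    "sub1": ["generator", "sub3"],
    "sub2": ["generator", "sub3"],
    "sub3": ["sub1", "sub2", "asset"],
    "asset": ["sub3"],
}
# Assume sub1 is infected, so it is left out of the trusted set.
trusted = {"generator", "sub2", "sub3", "asset"}

path = find_crank_path(links, "generator", "asset", trusted)
print(path)
```

In this toy layout the search routes around the infected substation. If no clean route exists, the function returns None, which roughly corresponds to operators having to clean a substation before they can proceed.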
Meanwhile, another team of DARPA-funded cyber researchers from seven different industry groups used custom-built technology to keep the grid operators’ efforts protected from cyber adversaries.
A third DARPA-funded team took the role of the cyber adversaries, throwing a wrench into the good guys’ efforts every time they seemed to be getting ahead.
“We have a bunch of things that try to make this as painful as possible for everyone,” project leader Walter Weiss told reporters on a rainy Tuesday, the sixth day of the exercise. “How do you actually keep the smartest people in the world busy for a week? That takes effort.”
Try, Try Again
The Plum Island exercise is the fourth black start exercise led by DARPA’s Rapid Attack Detection, Isolation and Characterization Systems, or RADICS, program, which Weiss leads. The first two exercises were conducted in research labs. The third one took place on Plum Island but on a smaller scale and without public observers.
DARPA plans to continue the exercises every six months until the RADICS program expires in 2020, Weiss said. After that, hopefully, the project will continue under the Energy Department or another federal agency, he said.
The RADICS exercise doubled as the second phase of an Energy Department exercise called Liberty Eclipse. The first phase, which took place in October, was a tabletop exercise during which government and industry officials game-planned policy options after a massive cyberattack against the grid.
That exercise ended with the fictional president declaring a grid emergency and the energy secretary using a power first formalized earlier this year to issue emergency orders to get the grid back up and running.
One of those orders—to get redundant power to the critical asset on Plum Island—marked the beginning of the on-island exercise this month.
While Weiss and project organizers pushed for realism in the exercise, they kept some details vague. The utilities were dubbed simply Utility A and Utility B. The scenario didn’t name the U.S. adversary that launched the grid-crippling cyberattack. Nor did it identify the “critical asset” that grid operators had to keep running.
In a real-world attack, that critical asset might be a hospital, a military base or any other building that’s critical for the nation’s functioning during an emergency.
In the exercise, the asset was an aged brick building outfitted, on an upper level, with five multi-colored air dancers—the colorful, fan-powered, headbanging nylon tubes that often adorn car dealerships and cellphone stores.
Weiss described the air dancers as “high visibility power indicators.” When the asset was receiving power, the dancers would do their thing and the grid operators, observing from a distance, could breathe easy. If the dancers started slouching, they knew something was wrong.
A Very Particular Set of Tools
The cyber researchers, who hailed from the National Rural Electric Cooperative Association, BAE Systems, Perspecta Labs and elsewhere, brought three main types of technology to the DARPA exercise:
- Tools that provided situational awareness about which portions of the grid cyberattackers had infected with malware and which remained secure.
- Tools that isolated healthy parts of the grid so they couldn’t be infected.
- Tools that assessed and diagnosed the nature of the cyberattack that brought the grid down.
The researchers’ primary focus was testing, communicating about and bypassing infected parts of the power grid without creating any digital connections that could carry malware infections into the tools themselves or into the portions of the grid restored after the attack.
Their situational awareness tools, for example, ignored digital signals from the grid and relied on basic physics tests that are impossible to hack. Their cellphones and other communications systems operated on local networks that were segregated from the internet and broader telecom networks.
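The article does not describe the physics tests themselves, but a small sketch can show why an independent physical measurement is valuable: it can expose a compromised device that keeps reporting healthy readings. Everything below is an assumption made for illustration, including the function name `flag_untrusted`, the 5 percent tolerance and the voltage figures.

```python
def flag_untrusted(reported_kv, measured_kv, tolerance=0.05):
    """Return the substations whose digitally reported voltage disagrees
    with an independent physical measurement by more than the given
    fractional tolerance, or that report nothing at all."""
    suspects = []
    for name, measured in measured_kv.items():
        reported = reported_kv.get(name)
        if reported is None:
            suspects.append(name)  # no digital reading to compare against
            continue
        # Guard against dividing by zero when the line is physically dead.
        scale = max(abs(measured), 1e-9)
        if abs(reported - measured) / scale > tolerance:
            suspects.append(name)
    return sorted(suspects)


# Hypothetical readings in kilovolts. sub2's relay claims the line is
# healthy, but the independent measurement finds it dead.
reported = {"sub1": 13.8, "sub2": 13.8, "sub3": 13.8}
measured = {"sub1": 13.7, "sub2": 0.0, "sub3": 13.9}

suspects = flag_untrusted(reported, measured)
print(suspects)
```

The tools in the exercise reportedly went further and ignored the digital signals entirely. The comparison here only illustrates how the two sources of information can diverge once a device has been tampered with.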
The goal wasn’t for the tools to compete against each other, Weiss said, but to test how effectively researchers and grid operators could use the tools after a truly devastating cyberattack.
In some cases, the tools didn’t perform as planned. In other cases, they worked well, but didn’t provide information in a format that was most useful to grid operators, Weiss said. That’s feedback the teams can use to rejigger their tools for the next exercise in six months, he said.
In other cases, the tools worked but were stymied by other factors that might also affect a real-world grid attack.
Researchers readied a weather balloon, for example, that could fly 500 feet above the island and detect acoustic hum and other indicators of where electricity was and wasn’t flowing properly. When reporters visited on the sixth day of the exercise, however, the balloon was grounded by persistent rain.
Earlier in the exercise, researchers spent an entire day chasing what they believed was a red team cyberattack but was actually just an anomaly in grid operations, Weiss said.
“It was just a giant false positive for a day,” he said. “If you take a bunch of researchers and stick them on an island like this, they’re going to get pretty paranoid.”
Finally, the tools often worked effectively but needed the researchers, who were based in nearby Orient Point, Long Island, to go out and tinker with them or to help the grid operators troubleshoot, Weiss said.
In the exercise, that meant a delay of an hour or two while researchers waited for the next ferry to the island and made their way to the utility or substation. In a real-world black start, however, that could mean a wait of days or more while a too-small cadre of harried cyber experts zipped from place to place.
Weiss’s challenge for the cyber researchers, he said, is that their tools should be so user-friendly by the final exercise in 2020 that grid operators—or anyone else without specialized cyber training—will be able to use them to re-establish power by simply reading a manual.
In a real-world grid attack, for example, National Guard units might be deployed to re-establish power to specific assets or to restart power in specific sectors, Weiss said.
And There Was Light
By the end of the seventh day, despite ongoing ransomware and other cyberattacks and the loss of power at Utility B, grid operators were able to re-establish power at the critical asset, Weiss told Nextgov in an email after the exercise.
DARPA’s main research focus for the exercise wasn’t the grid operators’ success or failure, however, but how well the tools withstood various impediments and assaults by the red team of cyberattackers, Weiss said.
If the grid operators and cyber researchers were over-performing, the red team would automatically throw something more difficult at them, Weiss said. That meant the grid operators were nearly foreordained to meet their goal by a whisker’s margin.
The red team socked away about 10 days of mischief for the seven-day exercise, Weiss said, so it could match the grid operators’ and researchers’ best work and still have something left over for the next exercise in six months.
“Our goal is to be dynamic,” he said. “We don’t want them to be perfect. We want to find the limits of the tools. We’re driving them to a point where we see how far they can get and then we beat them back down.”
That may sound sadistic, but it mirrors what grid operators and their cyber helpers are likely to face in a real-world massive attack by a U.S. adversary.
“If you look at advanced persistent threats, they get more tools, they don’t get less,” Weiss said, using a common phrase for highly skilled nation-state-backed hacking teams from Russia, China, Iran and elsewhere.
If the tools can withstand that sort of battering, Weiss said, they can be useful in less extreme situations.
“We exercise with that absolute worst-case scenario where everything’s gone wrong, everything’s failed for a month and ask how are our tools still relevant,” Weiss said. “If we can prove a tool works when everything else is broken, that gives us more confidence.”