Major San Francisco power outage caused by one breaker in one substation – what are the implications?

At approximately 9am on April 21, 2017, ONE breaker failed in PG&E’s Larkin Street substation. This ONE breaker in ONE substation brought the city of San Francisco to its knees. The implications are numbing, particularly considering all of the promises PG&E has made to the California PUC after the San Bruno natural gas pipeline rupture.

Given the substation’s walled enclosure, a physical attack such as the rifle attack against the PG&E Metcalf substation would not be possible. However, a cyber attack could certainly succeed. Not only could it succeed, it could also cause physical damage to the targeted breaker and/or transformer.

The Larkin Street substation is, by NERC’s definition, a transmission substation because it operates at more than 100 kV (as seen in the figure readily available on the Internet, it is a 115 kV substation). NERC CIP Version 5 generally considers substations “Low Impact” unless they support transmission facilities at 500 kV or higher, or operate between 200 and 499 kV at a single substation. Because the Larkin Street substation is considered LOW IMPACT, there is little claimed need to address cyber security.
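To make the voltage bright-lines concrete, here is a minimal sketch of the rule as summarized in the paragraph above. It is an illustration only: the function name, constants, and labels are my own assumptions, and the actual CIP standard applies further criteria to the 200-499 kV range that are not modeled here.

```python
# Hypothetical thresholds taken from this post's summary, not from the standard's text.
BES_TRANSMISSION_KV = 100   # above this, a substation counts as transmission
BRIGHT_LINE_KV = 200        # at or above this, "Low Impact" no longer applies by default


def classify_substation(voltage_kv):
    """Return (is_transmission, rating) using voltage alone (simplified)."""
    is_transmission = voltage_kv > BES_TRANSMISSION_KV
    if not is_transmission:
        rating = "Distribution (outside CIP scope)"
    elif voltage_kv >= BRIGHT_LINE_KV:
        rating = "Above Low Impact bright-line"
    else:
        rating = "Low Impact"
    return is_transmission, rating


# A 115 kV substation such as Larkin Street: transmission, yet Low Impact.
print(classify_substation(115))  # (True, 'Low Impact')
```

Note that nothing in this rule looks at how many customers, or which customers, a substation serves.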

The breaker that failed was scheduled to be replaced next year. The story is similar to that of the 2003 Northeast outage: the SCADA system at FirstEnergy had a known software bug and was scheduled for replacement, but it was not replaced in time and therefore “helped” cause the outage.

The PG&E breaker failure led to an outage affecting approximately 96,000 customers (per the DOE Energy Assurance Daily Report). However, a large high-rise with 10,000 people can count as one customer. Consequently, a better estimate of the number of people affected by Friday’s outage is probably 500,000 or more. The power outage disrupted San Francisco’s normally bustling financial district, home to many major multinational banks and technology companies. Wells Fargo closed 13 bank branches and four office buildings, while the New York Stock Exchange said its ARCA options trading floor in San Francisco was briefly unavailable. Employees in Goldman Sachs' financial district office were sent home. Baker Avenue Asset Management was in the middle of a trade when all of its systems went down, and employees in another state had to complete the transaction. Approximately 25% of the traffic lights in San Francisco were affected, and traffic was a disaster all day, with pedestrians at much higher risk than normal. The pedestrian issue was exacerbated when people left their buildings (at least those who were not trapped in elevators) and businesses shut down and sent employees home, making the traffic mess even worse. All of this impact came from the failure of ONE breaker in ONE LOW IMPACT substation.

When asked by a KCBS radio reporter how important the Larkin Street substation is, the PG&E representative answered that all PG&E substations are equally important. The Aurora vulnerability, in which breakers are opened and reclosed out of phase with the grid, can damage or destroy large, critical, long-lead-time equipment, which could result in extended outages. Additionally, the information on Aurora was declassified by DHS, making it publicly available. Several years ago, DOD proposed an Aurora hardware mitigation project with PG&E, but the project was not done. We do not know whether the Larkin Street substation and other PG&E substations are susceptible to Aurora or have installed the requisite Aurora hardware mitigation.

Generally, older breakers need more maintenance than modern digital breakers. NERC should intensify enforcement of its Protection and Control (PRC) standards, specifically PRC-005 (Protection System Maintenance) and PRC-019 (Coordination of Generating Unit or Plant Capabilities, Voltage Regulating Controls, and Protection). These standards were cited among NERC’s most violated standards in its Compliance Monitoring and Enforcement Reports for 2016.

The impact of the breaker failure at the Larkin Street substation calls into question PG&E’s transmission planning and its compliance with the NERC Transmission Planning (TPL) standards, which address contingency planning through redundant breakers and transmission feed capabilities. The impact should also raise red flags at DOE, FERC, and NERC about how Critical Infrastructure and Key Resources (CIKR) and Bulk Electric System (BES) assets are classified: classification needs to go beyond voltage bright-lines and kVA ratings. This outage demonstrates that system reliability, key facilities, and economic impact should be considered during CIKR classification.

The outage also demonstrates the limitations of several key NERC standards. The NERC BES Exception Request provides exceptions for defining a transmission asset. Exclusions E1 and E3 can be used to exclude substations over 100 kV from being defined as transmission and therefore from having to meet transmission requirements. Whether the Larkin Street substation has been defined as transmission or distribution should be of great interest to FERC and the California PUC (it is shown as a transmission substation on the above map). The NERC CIP cyber security standards specifically exclude distribution.

I will be presenting a paper in June 2017 in San Francisco at the American Nuclear Society Conference on the Implications of the Ukrainian Cyber Attacks to Nuclear Plants. The paper concerns the Ukrainian cyber attackers remotely controlling the substation breakers. As a version of the malware from the Ukrainian cyber attacks has been in the US grids for years, cyber threats to substation breakers are very real. Once an attacker has access to the substation breakers, damage to the substation and the facilities it supports, including Aurora-type damage, is not far off.

Joe Weiss