Bandolier: Gold Standard, or Only Half Way There?

Bandolier: Is halfway there good enough? I want to respond specifically to Ralph Langner's comments on my blog post on Severity Levels. Ralph posted:

"While I agree in general that severity cannot be established without context, experience tells me that such context can hardly be established by any kind of automated software tool. Even worse, many asset owners don't have any realistic idea, not to say methodology, of calculating the cost of potential cyber incidents. Without having seen the Bandolier product, my guess is that it goes half the way... which is better than nothing, after all. P.S. Why not discuss this stuff over at Digital Bond's?"

My response: Being honest, I want to discuss it here, because this is MY turf... and frankly, the Digital Bond folks don't seem to want to hear it. Let me show you why I feel that way.

On Monday I had a phone call with Dale Peterson to determine whether he would be interested in a joint proposal (Digital Bond and Applied Control Solutions) to extend Bandolier to address the control system issues identified in the original blog post. I felt, and still feel, that Bandolier is useful but not addressing all the issues it could, and I wanted to help make it an even more useful tool, all that it could be. So you're right, Ralph: Bandolier "is better than nothing." But do we want to stop at "better than nothing"?

I explained to Dale my concern that Bandolier appeared to be addressing computers, not systems. I was concerned that the Bandolier approach could cost end users significant money by forcing them to address non-critical systems. Additionally, I thought Bandolier could provide a false sense of security by not addressing the security of the systems and facilities themselves, which is, after all, why we do Critical Infrastructure Protection (CIP). Dale's response was that the security of the systems was not their scope.
He added that he felt it would be difficult to address the control system issues, but that he would be happy to use whatever result I could come up with as a plug-in to Bandolier. I can't help but feel that just because it would be difficult doesn't mean we shouldn't be addressing the system security issues, instead of trying to secure systems by securing individual computers.

According to Dale, Bandolier is a joint effort by the control system supplier, who provides the optimal operating system configuration for their new systems; Digital Bond staff, who review the configuration; and an end user (such as TVA). He then mentioned that the purpose of Bandolier was to be the "gold standard." I asked if the current "gold standard" was NIST SP800-53. Dale said it was the NIST Federal Desktop Core Configuration (FDCC). Remember, the FDCC was developed for desktop operating systems, not industrial control systems. Writing a "gold standard" for industrial control systems is exactly what NIST SP800-53 and ISA99 are trying to do. So why are we ignoring that work, from people who have actually worked with control systems? If somebody can explain this to me in short, simple sentences, maybe I won't be so confused.

I then asked Dale how Bandolier handled old, obsolete operating systems such as Windows NT4 and Windows 95. Dale said Bandolier had no files for these old systems. These older systems are still in use even with many modern DCS upgrades and often cannot be replaced. How can you ignore them? Remember, companies like ATS (Advanced Technical Services) still manufacture out-of-date circuit boards for industrial computers and PLCs that are 25 years old and cannot be taken out of service... but that are networked through serial device servers on the plant floor. These systems will not go away for decades. Why are we ignoring them?

I believe the purpose of control system cyber security (CIP) is not to secure a computer, but to secure a system and/or facility.
The vulnerability of a computer is important only if it leads to a compromise of a process or facility. To explain why it is so important to secure systems, not computers, I will provide two examples. Mark Hadley from DOE's Pacific Northwest National Laboratory (PNNL) and I created a list of myths about control system cyber security. One myth is that firewalls are enough protection. As Mark and others would say, "firewalls are speed bumps," particularly if appropriate rule sets are not used. The hacking demonstration performed by Idaho National Laboratory staff at the 2004 Control System Cyber Security Conference in Idaho Falls traversed MULTIPLE sets of firewalls before compromising (opening and closing) the smart relays and operator displays. Another myth is that VPNs secure networks. At the 2007 Conference in Portland, PNNL compromised OPC packets and then used a VPN to camouflage the compromised packets. The result was a compromise of a modern SCADA system (changing voltages) and the operator displays. Both examples demonstrate that it is the system that needs to be secured, not the computer.

Here are my comparisons of what severity really means in the process control system context versus what Digital Bond's Bandolier project defines severity to be. I hope this begins to explain the differences:

Severe

Bandolier: This represents the most serious potential impact to the control system. A check that is non-compliant and has an internal rating of severe generally indicates that the system is at risk unless other specific mitigation measures are in place. Poorly configured directory permissions or network services, for example, can lead to system compromise and would have the severe rating. Examples:
- Incorrectly configured permissions on critical system directories such as /etc/passwd
- Web server or FTP server with improper user restrictions

Control systems: This represents failures, omissions, or errors in design, configuration, or implementation of required programs and policies which have the potential for major equipment and/or environmental damage (millions of dollars), extreme physical harm to facility personnel or the public, and/or extreme economic impact (bankruptcy). Example:
- The Bellingham, WA gasoline pipeline rupture: 3 killed, $45M in damage, and bankruptcy of the Olympic Pipeline Company. The National Transportation Safety Board identified the cause as the operator using the operational SCADA system for development work. A Bandolier check would not have identified the problem that led to this obviously severe event.

Moderate

Bandolier: This category represents a variety of checks with potential control system security impact. They may not lead to system compromise by themselves, but could aid an attacker or become a more serious problem in the event of some other failure or compromise. Included in this category are items such as unnecessary services, inadequate password strength, insufficient logging, etc. Examples:
- Network share that exposes sensitive control system information
- Incorrectly configured security event log settings
- Weak password requirements, such as inadequate length or complexity requirements

Control systems: This represents failures, omissions, or errors in design, configuration, or implementation of required programs and policies which have the potential for moderate equipment and/or environmental damage (tens of thousands of dollars) with at most some physical harm to facility personnel or the public (no deaths). Examples:
- The Maroochy (Australia) wireless hack, which caused an environmental spill of moderate economic consequence. I don't believe this would have been identified by a Bandolier check.
- The Browns Ferry 3 Nuclear Plant broadcast storm, which could have been caused by a bad Programmable Logic Controller (PLC) card or insufficient bandwidth, would not have been detected by Bandolier testing.

Informational

Bandolier: This category represents checks that may not pose a threat to the system or are simply informational in nature. These are typically identification checks that indicate the role or version of a particular control system application. Example:
- Configuration file indicates that the system is serving in a particular role (e.g., historian, real time, etc.)

Control systems: This represents failures, omissions, or errors in design, configuration, or implementation of required programs and policies which have the potential for minimal damage or economic impact (less than $10,000) with no physical harm to facility personnel or the public. Example:
- The Davis-Besse Nuclear Plant cyber incident, caused by a contractor plugging a laptop contaminated with the Slammer worm into the plant Safety Parameter Display System. I don't believe this would have been identified by Bandolier testing.

I want to reiterate that ACTUAL control system cyber incidents, including Bellingham, Maroochy, Browns Ferry, Hatch, the Florida outage, etc., were not caused by incorrect operating system configurations or operating system vulnerabilities. Neither were many of the other control system cyber incidents in my incident database. Is CIP protection of critical infrastructure or of "critical computers"? Why do DOE and DHS continue to fund projects that do not address actual control system cyber incidents that have already occurred? In fact, why don't they want to know about these incidents, most of which have NOT been reported to government? Why isn't DOE funding projects like Aurora, with solutions to demonstrated control system vulnerabilities? Yes, the Bandolier project is worthwhile. Yes, Ralph, we should be developing automated tools to help us secure control systems.
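Returning to the severity comparison above, the core difference is that Bandolier rates the configuration finding on a host, while the control system scale rates the consequence to the facility. A minimal sketch, with incident data simplified from this post and invented field names (nothing here is a real Bandolier check), shows how a Bellingham-style event scores on each scale:

```python
# Illustrative only: two severity scales applied to the same incident.
# Field names and thresholds are simplified from the post.

def bandolier_severity(finding):
    """Rates only the host-level configuration finding."""
    if finding.get("bad_permissions") or finding.get("open_service"):
        return "severe"
    if finding.get("weak_password") or finding.get("poor_logging"):
        return "moderate"
    return "informational"

def control_system_severity(impact):
    """Rates the consequence to the facility, per the post's thresholds."""
    if impact["deaths"] > 0 or impact["damage_usd"] >= 1_000_000:
        return "severe"
    if impact["damage_usd"] >= 10_000:
        return "moderate"
    return "informational"

# Bellingham-style event: nothing for a host configuration check to see,
# yet a catastrophic facility impact.
finding = {}
impact = {"deaths": 3, "damage_usd": 45_000_000}
print(bandolier_severity(finding))      # informational
print(control_system_severity(impact))  # severe
```

The same incident rates "informational" (or invisible) on a host-configuration scale and "severe" on a consequence scale, which is the gap the comparison is meant to illustrate.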
But the Bandolier project is not designed to go all the way to where we should be.

Joe Weiss

What are your comments?


Comments

  • Wow, that was an extensive response. Thanks for that one, Joe.

    Your reasoning is quite convincing, but I'm getting the impression that any dissent between you and Dale, if one exists, is mostly about the proper usage of the term severity. If we substitute severity in Bandolier's context with something like conformance, compliance, or vulnerability rating, both approaches might coexist pretty well, and could probably even be integrated.

    As posted elsewhere, I don't see a high-level risk assessment tool around the corner. So why not accept that going half the way at this time is better than just waiting and not going at all? I have seen many times that waiting for some standard or do-it-all procedure is just an excuse for doing nothing. On the other hand, simple tools which cover the basics can get things moving. Sometimes the momentum carries on, as I have seen more than once with the very simple tools that we provide our clients with.

    Regarding Dale's blog... I think I'm not the only reader who would appreciate your valuable opinion over there, but that's certainly your decision.

    Reply

  • Joe,

    This is a great summary, and you all bring interesting viewpoints to bear here; however, I must interject a few points into this particular discussion. In the interest of disclosure, I may post this in numerous places to ensure all parties discussed are adequately informed.

    1. Dale cannot just change Bandolier. No matter how much he would like to, or whether he thinks it is a good or bad idea, he is simply bound to a scope by his customer. He may be able to convince his customer to change, but that is a different story. Hence his option and offer to include a plug-in for Bandolier.

    2. In your discussions with Dale, you have both fallen prey to the old "convergence" issue that is commonly batted about and is now old hat, particularly in the area of definition. For example, you and Dale have both used the word "system" interchangeably. After reading many posts on Bandolier on Digital Bond's site, it is my understanding that Bandolier is meant to address a single, individual computing asset, not a greater networked set of computing assets. It is not a bad thing to have different vocabularies, but we need to set ground rules in communication to ensure adequate and fair dialog.

    3. You are absolutely correct in your theory of addressing the security of "the system," which in this context I define as "every asset in the place used to make product." However, in the previous paragraphs you indicate that your "...concerns were that I thought the Bandolier approach could cost the end users significant money by having to address non-critical systems." I can assure you that there is no such thing as a "non-critical system" when dealing with networked control systems. If it is connected by any medium, even via sneaker-net, it is vulnerable to compromise. That said, Bandolier has its place.

    You included a standalone statement that I am bound to address: "I believe the purpose of control system cyber security (CIP) is not to secure a computer, but to secure a system and/or facility. The vulnerability of a computer is important only if it leads to a compromise of a process or facility." While I agree with the first statement, the second, in my professional opinion, is very dangerous. We cannot wait for an asset to be compromised before securing that asset or system. Further, we are required to take additional steps to identify and address these risks. A computer vulnerability may not compromise the infrastructure in and of itself, but when that vulnerability is exploited, there is a good chance that it will.

    4. The gold standard issue is important here for a number of reasons. First, NIST 800-53 was never specifically intended to be 100% focused on control systems. 800-53 grew out of the passage of the E-Government Act of 2002 by the US Congress, specifically Title III, the Federal Information Security Management Act. This matters because it addresses only computing assets and Information Technology infrastructure, not SCADA and Industrial Control System assets. In my opinion, it is not wise for us in the industry to adopt a "gold standard" for security based on a standard that addresses only a small portion of our infrastructures. NIST 800-82, on the other hand...

    Secondly, the "gold standard" issue is important because a vast majority of customers do not want the "gold standard." They see that as "gold plating" and want to do the least amount possible, driven by manpower, skill set, and budget, while maintaining major compliance with whatever standard is chosen. We need to walk before we run.

    Now for the technology discussion. The Information Technology assets (firewalls, VPNs, IDS/IPS, etc.) used in this space are only as effective as we make them. If you allow any control traffic through a firewall, you have to accept the consequences of that action. 99% of firewalls today do not do any deep packet inspection on control system traffic; they either pass or block it as packet filter devices that understand state-level communications. This is why you cannot let control system traffic through a firewall. A perfect example is the Wonderware SuiteLink situation: one could compromise a system (by both definitions of system) through a firewall that allowed SuiteLink traffic through it. That is why control traffic stays home.
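    A minimal sketch of the gap described here: a port/state-based filter admits anything on an allowed flow, while deep packet inspection examines the control-protocol payload itself. The flow tuples and function codes below are invented, loosely Modbus-like illustrations, not a real firewall API.

```python
# Illustrative only: port filtering vs. payload (deep packet) inspection.

ALLOWED_FLOWS = {("10.0.0.5", "10.0.0.9", 502)}  # e.g. HMI -> PLC, port 502
READ_ONLY_FUNCS = {3, 4}                         # hypothetical "read" codes

def packet_filter(src, dst, dport, payload):
    """Typical stateful filter: decides on addresses and ports only.
    The payload is never examined."""
    return (src, dst, dport) in ALLOWED_FLOWS

def deep_inspect(src, dst, dport, payload):
    """DPI: additionally parse the protocol function code and reject writes."""
    if (src, dst, dport) not in ALLOWED_FLOWS:
        return False
    func_code = payload[0]  # first byte = function code (illustrative layout)
    return func_code in READ_ONLY_FUNCS

# A "write register" request (hypothetical function code 6) from an allowed host:
malicious = ("10.0.0.5", "10.0.0.9", 502, bytes([6, 0, 1, 0xFF]))
print(packet_filter(*malicious))  # True  - the port filter lets it through
print(deep_inspect(*malicious))   # False - only payload inspection stops it
```

    The point of the sketch matches the comment: once control-protocol traffic is allowed through a filter that never reads the payload, any command the protocol supports goes through with it.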

    VPNs are the same way. The purpose and intent of a VPN is to provide secure, network-based remote access to a single point, which in turn provides general, controlled access to a networking infrastructure. Remember, this is an access mechanism, not a perimeter or defense-type security mechanism (just as VLANs are an organizational tool, not a security mechanism). We have to use this technology appropriately and in conjunction with other technologies to provide a secure and, more importantly, monitored and watched environment. All the security hardware and software in the world means nothing if you do not constantly and vigilantly monitor and track all traffic (down to the data link layer) in these environments. This leads me to intrusion detection systems/intrusion prevention systems (IDS/IPS).

    These environments are intended to be very static in nature. The controls engineers deploy these systems, make product for 25 years, and then forklift-upgrade them. Because this is the case, there should be a very exact, predictable, and documented traffic pattern, and because of these patterns it could be rather effective to watch this traffic with a heuristic- and signature-based IDS/IPS. Going further into threat analysis, the data indicate some fairly broad categories of failure modes from network-based threats (DoS, fragmentation, etc.). As I indicated with firewalls, IDS/IPS technologies are still fairly limited in their dissection of control protocols. However, if we increase our knowledge of these failure modes, we can leverage the IDS/IPS not only to detect these events but, more importantly, to report to the asset owners that a compromise has occurred or is currently occurring. As time progresses, the greater the number of these events captured and logged, the more valuable the quantitative risk analysis that can be conducted. This may lead to predicting when network operations will affect process reliability. At that point, we are no longer dealing with fuzzy "risk" numbers, but instead focusing with laser-like precision on what actually matters: how and when systematic faults occur that affect efficiency and reliability.
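    The baseline idea here, that a static network's small, predictable set of flows makes anything unseen worth an alert, can be sketched as follows. The flow tuples are invented placeholders, not a real IDS interface.

```python
# Illustrative only: whitelist-style anomaly detection for a static network.
from collections import Counter

def learn_baseline(flows):
    """Build a whitelist from traffic observed during known-good operation."""
    return set(flows)

def detect(baseline, live_flows):
    """Return flows never seen during baselining, with occurrence counts."""
    return Counter(f for f in live_flows if f not in baseline)

# Known-good traffic in a (hypothetical) plant network:
good = [("hmi", "plc1", 502), ("historian", "plc1", 502), ("hmi", "plc2", 502)]
baseline = learn_baseline(good)

# Live traffic includes an unexpected source talking to a PLC:
live = good + [("laptop", "plc1", 502), ("laptop", "plc1", 502)]
print(detect(baseline, live))  # Counter({('laptop', 'plc1', 502): 2})
```

    A signature-based engine would complement this by matching known-bad payloads; the whitelist approach above catches the "never seen before" traffic that a static control environment makes unusually easy to define.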

    The end result for all of us is almost the same. Bandolier provides a means to identify and address the risk of computing assets in the environment. Joe is right that more needs to be done. Neither of these two perspectives offers us the ability to truly inspect the performance of these systems. After all, almost every real-world example Joe introduced was either a personnel/procedural/organizational breakdown or a performance-based compromise or failure of the control system itself. I propose that we need a mechanism to watch, alert on, and filter the bits that manipulate SCADA and Industrial Control Systems, to ensure the safe and secure operation of those systems moving forward. This methodology, deployed in conjunction with projects like Bandolier and standards such as NIST 800-82 (whenever it is finished), NIST 800-53, ISA SP-99, and others, gives us the means to address the entire infrastructure, not just its parts.

    Thanks for the forum. Back to lurking...

    Kind Regards,

    -bhh
    si vis pacem, para bellum

    Reply
