The quest for continuous improvement in the process industries always requires change. However, when not managed properly, change can lead to disaster. Chemical makers and refiners often use their industrial control system (ICS) — the cyber-physical assets responsible for automated controls and safety — as the platform for continuous improvement. At most sites, the ICS undergoes more changes than any other production asset. Yet, while operating companies around the world for almost three decades have accepted management of change (MOC) as a best practice when altering physical assets such as valves and pumps, many processors have failed to consistently apply the same level of rigor to managing configuration changes to the ICS. Investigations into several major plant accidents by the U.S. Chemical Safety Board and the U.K. Health and Safety Executive have identified improper modifications to ICS alarms, control loops and safety instrumented systems (SISs) as either a major contributing factor or a root cause of the incident.
Meanwhile, the fast-growing threat of cyberattacks initiated by nation states or criminals seeking ransoms has created an urgent need to lock down and protect the ICS configuration. Furthermore, unmanaged change initiated by internal actors (employees and contractors) can lead to the same catastrophic consequences that an external bad actor can inflict on a production facility.
Boards of directors today recognize the risks to their process safety, profitability and brand reputation posed by unprotected cyber-physical assets. The good news is that defining and implementing a basic cybersecurity strategy that includes change management of the ICS configuration will go a long way in protecting against cyber vulnerabilities, both external and internal.
A Crucial Element
ICSs comprise field instruments (sensors and actuators), distributed control systems (DCSs), SISs, supervisory control and data acquisition (SCADA) systems, process historians, advanced applications, process analytical systems, and more. An ICS plays a number of key roles:
Repository for Intellectual Property (IP). The ICS is the real-time container of IP, the collective knowledge essential for effective performance and safety. For instance, a DCS may hold details such as the highly proprietary recipe for a polymer product or a complex strategy for controlling the outlet temperature of an ethylene furnace. The configuration of a DCS represents important and highly valuable company IP. Its configuration also includes operational, safety and equipment design operating limits. So, protecting the trade secrets embedded in the configuration of a DCS must be a top concern to a corporation’s general counsel and chief financial officer.
Defender of Safety. The basic process control loop function in a DCS provides protection against process disturbances in real time, preventing a minor upset from becoming a major abnormal situation. The DCS alarm management system notifies the console operator when intervention is required to correct a process or equipment anomaly. The SIS is designed to prevent significant equipment damage as well as catastrophic incidents by detecting unstable and out-of-control conditions and initiating a graceful shutdown of the process. Mechanical relief systems go to work in situations where the SIS has failed to effectively contain an abnormal situation.
Protector of Equipment. Operational, safety and design boundaries configured in the DCS ensure that automated control loops can push the process to its farthest limits without violating critical constraints. The DCS provides this protection automatically, 24/7.
Platform for Continuous Improvement. The ICS is like a fine bottle of wine: it becomes more valuable as it ages. That’s because control and production engineers are constantly modifying the configuration of the system to enhance controllability, safety, quality and yield. Continuous improvement requires continuous change to the system. It’s not unusual for a control engineer to alter the configuration of a system multiple times a week.
Challenges In Managing Change
Change can deliver improvement only if it’s managed methodically and consistently. Ensuring effective MOC for ICSs requires grappling with a number of difficult issues. These include:
Disparate and multigenerational systems. As a result of plant expansions, acquisitions and modernization projects, a typical process plant today may rely on ten different classes of ICSs and applications, from five major automation vendors, representing three different generations of technologies.
Highly complex and proprietary structures. ICSs are inherently complex because they contain detailed configuration and logic programs for automatic control of the process. ICSs also are highly proprietary. Each control system type has a unique architecture, communication protocol, hardware and operating system. Interoperability among control systems from different automation vendors is achievable but difficult to implement and maintain due to the proprietary nature of each system. In fact, different generations of control systems from the same automation vendor usually require special gateways to communicate. Proprietary ICSs generally aren’t designed to operate third-party applications. The complexity of ICS devices and the difficulty of achieving interoperability among them have created considerable engineering and operational challenges for owner-operators.
The human factor. The control system administrator at many process plants also serves as the process control engineer. As such, that person often is the designer, implementer, quality assurance engineer and trainer for improvements to the ICS configuration. Even a seasoned engineer who is deeply trusted by the organization is human and prone to error. Human error is a significant contributing factor in most industrial accidents. Additionally, the number of qualified professionals to manage the ICS continues to decline due to an aging workforce and a diminishing pipeline of qualified engineers. Many companies struggle to attract new college graduates to manage and maintain 30-plus-year-old technology.
ICS cybersecurity. This is one of the greatest risks threatening the industrial sector today. Cybersecurity has attained the same level of priority in the minds of industry executives and board members as safety has over the past three decades. The threat of a cyberattack on an ICS no longer is viewed as a hypothetical possibility — and there’s widespread recognition that potential instigators include far more than a few grungy hackers in some dark basement. Legitimate experts and governmental threat intelligence professionals have traced several attacks on the ICS within critical infrastructure to nation states. Meanwhile, criminals see an opportunity in ransomware attacks such as the one against the city of Atlanta in 2018. Boards of directors now appreciate the cyberthreat to their production assets as a serious risk to safety and profitability. They are demanding plans from CEOs and executive staff to proactively manage that risk.
Lessons From Process Safety
In 1992, the Occupational Safety and Health Administration (OSHA) issued its Process Safety Management (PSM) standard, 29 Code of Federal Regulations (CFR) 1910.119, for operators and handlers of highly hazardous chemicals. The sixteen articles of the PSM have become a standard compliance practice etched into the culture of most process companies worldwide. What makes this regulation unique is its broad acceptance and ubiquitous practice by the industry. The fact that OSHA regulators sought industry’s safety best practices and incorporated them into the regulation has made 1910.119 the darling of professional safety practitioners. One of the most prominent articles of 1910.119 is Article 8, Management of Change.
Most companies have implemented some level of MOC procedures to avoid incidents caused by mistakes resulting from modifications to the configuration of the ICS. These MOC procedures are designed to prevent inadvertent errors by well-intentioned employees. However, managing change on the ICS has become more difficult because companies now must consider change initiated by two new adversaries: the malicious insider and the third-party bad actor. The consequences of a faulty configuration change to the ICS are the same regardless of the source. Operating companies must take configuration change management seriously and implement work processes to ensure the integrity of the ICS. A robust MOC program for the ICS must underpin any ICS cybersecurity strategy.
Just as MOC has become an integral part of the safety culture in most process companies, it now must become a cohesive component of cyber-physical asset management.
The Path Forward
Industry executives recognize the crucial role of the ICS in process safety and operations. They also have come to understand that ICS vulnerability to cyberattacks is a real and potentially significant business risk that must be managed accordingly. Many leading operating companies have launched corporate-wide programs to systematically manage their exposure to ICS cybersecurity risk. Others currently are developing their strategy to tackle this risk. Addressing the cybersecurity risk is a considerable undertaking.
Success depends upon building a solid foundation in two ways: acting with urgency and engaging the executive team.
The current state of ICS cybersecurity is not unlike where PSM 1910.119 was in 1992, with one exception: 1910.119 was a regulation with a five-year grace period to attain 100% compliance. Cybersecurity has no grace period. It’s a real and immediate business risk but one that’s stealthy and unpredictable. Like a known safety gap left unmitigated, it continues to pose a risk to the business. It demands urgent action.
A successful ICS cybersecurity risk management program starts with the support of the board of directors and the executive management team. In addition, it requires high-level accountability for the overall cybersecurity program and its execution.
Then, with urgency and executive engagement, an ICS cybersecurity program, at a minimum, should focus on the following five critical steps:
1. Know what you have. Treat your ICS cyber-physical assets as the invaluable assets they are to your process safety and production. First, establish an accurate inventory of the ICS assets throughout the enterprise. The inventory must encompass the mix of disparate mission-critical devices that make up the ICS, including operator consoles, process controllers, input/output cards, smart field instruments, programmable logic controllers and other devices. At minimum, capture hardware make and model, operating system and firmware version/revision, software applications and other relevant information for each device. Additionally, include the physical location of each asset. A complete and accurate inventory of the ICS assets is an essential first step toward everything else in the program.
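As a minimal sketch of what one inventory record might capture, the Python example below stores the attributes listed above and flags records that are still incomplete. The class name `IcsAsset`, its field names and the sample devices are illustrative assumptions, not the schema of any vendor tool.

```python
from dataclasses import dataclass, field


@dataclass
class IcsAsset:
    """One entry in a hypothetical ICS asset inventory."""
    asset_id: str
    device_type: str       # e.g. "process controller", "operator console"
    make: str
    model: str
    os_version: str
    firmware_version: str
    location: str          # physical location of the asset
    applications: list[str] = field(default_factory=list)


# Fields the inventory treats as mandatory (an assumption for this sketch).
REQUIRED_FIELDS = ("make", "model", "os_version", "firmware_version", "location")


def missing_fields(asset: IcsAsset) -> list[str]:
    """Return the names of required fields that are still blank."""
    return [name for name in REQUIRED_FIELDS if not getattr(asset, name).strip()]


inventory = [
    IcsAsset("PLC-001", "programmable logic controller", "ExampleVendor",
             "X100", "n/a", "2.4.1", "Unit 3 rack room"),
    IcsAsset("HMI-007", "operator console", "ExampleVendor",
             "C20", "", "", "Central control room"),
]

# Flag incomplete records so they can be followed up in the field.
incomplete = {a.asset_id: missing_fields(a) for a in inventory if missing_fields(a)}
print(incomplete)
```

In this sample, only the operator console is reported, because its operating system and firmware versions have not yet been recorded.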
2. Understand your risks. Conduct a risk assessment. Make it a high priority. The purpose of an ICS risk assessment is to identify, classify and prioritize security gaps that can impact the availability and proper functioning of the ICS. Remember that the ICS includes the SIS. If your SIS contains critical vulnerabilities such as bypass of a trip function, it may fail to safely shut down the process under hazardous conditions.
Risk assessment is another area where ICS cybersecurity can benefit from the proven practice of process hazard analysis outlined in PSM 1910.119. The premise here is that not all devices are mission critical and not all vulnerabilities can result in a safety incident or lead to production loss. To that end, giving different classifications to cyber assets based on threat exposure, vulnerability and consequences is the proper approach to assessing risk.
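One way to picture this kind of classification is a simple scoring scheme. The sketch below assumes each asset is rated from 1 (low) to 3 (high) on threat exposure, vulnerability and consequence, and that the product of the three ratings determines a priority tier. The scale, the thresholds and the asset names are assumptions made for illustration, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class RiskRating:
    """Hypothetical 1-3 ratings for one cyber asset."""
    asset_id: str
    exposure: int       # 1 = isolated, 3 = reachable from the business network
    vulnerability: int  # 1 = no known gaps, 3 = known unpatched gaps
    consequence: int    # 1 = nuisance, 3 = safety incident or major production loss

    def score(self) -> int:
        # Multiplying keeps a low rating on any one factor from being ignored.
        return self.exposure * self.vulnerability * self.consequence

    def tier(self) -> str:
        s = self.score()
        if s >= 18:
            return "critical"
        if s >= 8:
            return "high"
        return "routine"


ratings = [
    RiskRating("HIST-02", exposure=3, vulnerability=2, consequence=1),
    RiskRating("SIS-01", exposure=2, vulnerability=3, consequence=3),
    RiskRating("DCS-04", exposure=2, vulnerability=2, consequence=3),
]

# Work the list from the highest score down.
ranked = sorted(ratings, key=lambda r: r.score(), reverse=True)
for r in ranked:
    print(r.asset_id, r.score(), r.tier())
```

With these sample ratings the safety system ranks first, ahead of the control system and the historian, even though the historian is the most exposed of the three.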
3. Tackle vulnerabilities. Immediately address known vulnerabilities with high impact on critical devices. At the highest level, two categories of vulnerabilities exist: known and unknown. ICS vendors issue patches for known vulnerabilities; operating companies must act with urgency in dealing with any such high-impact vulnerabilities. Obviously, a company can’t address unknown vulnerabilities.
4. Manage change. Attackers take control of the process by making changes to the configuration of the ICS. Generally, ICS hackers have deep knowledge of the targeted system. They also know the environment they are attacking, whether an ethylene furnace or an electric grid. Through social engineering and information readily available on the Internet, they can map their target and act with precision to create unsafe conditions. Types of attacks include manipulating control functions to immediately trip a process or modifying code to disable or bypass a function that is required in the future. Disabling a critical alarm, for instance, denies the operator the notification of an emerging hazardous condition requiring immediate action. In another example, bypassing a trip function works as a ticking bomb in the SIS.
What is needed first is a rigorous MOC policy. Stakeholders such as engineers, operators and technicians must receive training on the policy. Next, the automation system team must capture a baseline of the system. A baseline represents the gold standard against which to evaluate future changes to the system. Changes beyond those the console operator needs to apply to run the process must require adherence to the MOC policy. Changes to control strategies, critical alarm settings and safety systems must begin with documenting the current configuration, proposed changes, what-if analysis of what can go wrong, approval by appropriate staff, training of operators, and other standard steps defined in the MOC procedure.
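To illustrate how a baseline supports MOC, the sketch below compares a current configuration snapshot with the stored baseline and separates changes covered by an approved MOC record from those that are not. It assumes the configuration can be exported as flat tag-to-value pairs; the tag names, values and the `approved_tags` set are hypothetical, and a real system would typically need a vendor-specific export step first.

```python
def diff_config(baseline: dict, current: dict) -> dict:
    """Return {tag: (old, new)} for every tag whose value differs.

    None stands for "tag not present", so this sketch assumes
    None is never a legitimate configuration value.
    """
    changes = {}
    for tag in sorted(set(baseline) | set(current)):
        old, new = baseline.get(tag), current.get(tag)
        if old != new:
            changes[tag] = (old, new)
    return changes


baseline = {
    "FIC101.GAIN": 1.2,
    "TI205.HI_ALARM": 850.0,
    "SIS.PT301.TRIP_BYPASS": False,
}
current = {
    "FIC101.GAIN": 1.5,                # controller retuned under an MOC
    "TI205.HI_ALARM": 850.0,
    "SIS.PT301.TRIP_BYPASS": True,     # trip function bypassed
}

# Tags covered by an approved MOC record (hypothetical).
approved_tags = {"FIC101.GAIN"}

changes = diff_config(baseline, current)
unapproved = {tag: vals for tag, vals in changes.items() if tag not in approved_tags}
print(unapproved)
```

Here the tuning change is accounted for, while the bypassed trip function surfaces as an unapproved change that warrants immediate investigation, whatever its source.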
5. Automate backup and recovery. Companies must assume their ICS will be compromised at some point, just as they assume a safety incident will occur despite all the mitigation measures. Incident response is a critical part of an effective ICS security strategy. It’s absolutely critical to maintain an up-to-date backup of the system configuration, one no older than the typical interval between changes to the device (weekly for assets modified on a day-to-day basis, monthly or bi-monthly for devices altered less frequently). As discussed earlier, change to the control system’s configuration is the vehicle for implementing many continuous process improvement ideas. It’s also the mechanism used by bad actors to hijack an ICS and inflict harm. Automated backup of the system is essential to rapid recovery from a shutdown.
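The backup freshness rule can be expressed as a simple automated check. The sketch below assumes two change-frequency classes with maximum backup ages of 7 and 30 days, loosely following the weekly and monthly figures above; the class names, asset IDs and dates are made up for illustration.

```python
from datetime import date, timedelta

# Maximum acceptable backup age per change-frequency class (assumed values).
MAX_BACKUP_AGE = {
    "frequent": timedelta(days=7),     # assets modified day to day
    "infrequent": timedelta(days=30),  # assets altered less often
}


def stale_backups(last_backup: dict, change_class: dict, today: date) -> list:
    """Return the IDs of assets whose latest backup is older than allowed."""
    return sorted(
        asset for asset, when in last_backup.items()
        if today - when > MAX_BACKUP_AGE[change_class[asset]]
    )


today = date(2024, 6, 30)
last_backup = {
    "DCS-04": date(2024, 6, 26),
    "SIS-01": date(2024, 5, 20),
    "PLC-001": date(2024, 6, 10),
}
change_class = {
    "DCS-04": "frequent",
    "SIS-01": "infrequent",
    "PLC-001": "frequent",
}

print(stale_backups(last_backup, change_class, today))
```

Running a check like this on a schedule turns the backup policy into something that can be monitored rather than assumed.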
Final Thoughts
The ultimate intention of bad actors attacking critical infrastructure isn’t to move bits of digital information but to move molecules and electrons in a manner that causes physical harm. Documented cyberattacks on critical infrastructure in every case have included configuration changes to the ICS that led to physical harm. It is hard enough to avoid safety incidents initiated due to inadvertent errors by well-intentioned people in the organization. Now, we also must consider changes implemented by those with bad intentions. ICS security is a safety challenge. The effective consequence of a successful cyberattack by bad actors is no different from that of an actual safety incident. Process industries executives must address the ICS cybersecurity challenge in the same manner they have successfully dealt with the ever-present challenge of process safety.
EDDIE HABIBI is the founder and CEO of PAS Global, Houston.