6 Process Safety Pearls of Wisdom

Jan. 7, 2025
We’ve extracted a few gems from the 2024 season of Process Safety with Trish & Traci. Listen in as we discuss recurring accidents, emergency systems and organizational learning.

Welcome to Process Safety with Trish and Traci -- the podcast that aims to share insights from past incidents to help avoid future events. Please subscribe to this free podcast on your favorite platform so you can continue learning with Trish & me in this series.

I’m Traci Purdum, editor-in-chief of Chemical Processing.

To kick off the new year, let’s listen to the sage advice that Trish Kerin, my podcast partner, gave us last year regarding lessons learned from many incidents.

As we were discussing the 40th anniversary of the Mexico City LPG disaster, where a pipe burst at a Pemex facility and caused a massive explosion, equivalent to five Hiroshima bombs, that resulted in 500 deaths, I asked Trish why we still make the same process safety mistakes. Trish offered this insight.

Why Do We Make the Same Process Safety Mistakes?

Trish: It comes back to, in my opinion, an issue of organizations failing to learn. Now, the great Trevor Kletz once said, "Organizations have no memory. Only people have memory, and when they move on, the accidents reoccur." And so it's not about apportioning blame to companies or individuals. In fact, it's about thinking about how we ensure that we can take the lessons that we see on a regular basis and truly create that learning in the organization. There are no new lessons; we just keep seeing repeated types of events again and again. And I'm not sure that we've really yet figured out how to create a learning organization anywhere, because it always does come down to the individuals.

And as individuals move on, they take with them their experience and we bring in newer people that don't have the same experience. And so this is why we need to make sure we have excellent knowledge management systems in organizations where the history is kept, and that we have a documented basis of safety for the design of our facilities. We need to understand why designs were made in a certain way, what the meaning of it was, what the purpose of it was, so that when we do a management of change, again, a structured process, we can adequately consider what the current hazard is and what change to that hazard occurs, if any, when we make the change we're proposing. Then we make sure we document it, make sure we train people in it, and make sure all of our systems are updated to account for it as well.

So it's a very broad question. It's not about any one company, it's about how humans learn and the fact that we can have an experience, but that actually doesn't mean we've learned. It means we've had an experience. Until we take the time to go back and reflect on that experience, we'll never truly realize the learning from it. And so making sure that we do reflect as individuals, as companies. And making sure that we document adequately everything that has occurred, how it's occurred, that we thoroughly investigate, that we do put in place corrective actions to prevent reoccurrence, and that we also understand why things work when they work. This is not just about why things don't work. We can actually also learn a lot from why things do work and how can we improve what we do.

We need to continuously improve as well. So it's not a clear-cut answer. I think to a certain extent as humans, we have some fallibilities that we need to understand so that we can adequately manage them. We can't get rid of them. We are human, and being human and being fallible makes us unique and it actually makes us very special creatures because we do have the ability to learn from these things. But we need to make sure that as organizations, we help people learn through that whole ecosystem that we're part of.

Out of Disaster Comes Change

The Phillips Petroleum Complex explosion in Pasadena, Texas, on Oct. 23, 1989, was a tragic incident that killed 23 people and injured 314 others. This event led to significant improvements in process safety, including the development of OSHA's Process Safety Management standard, and it influenced the EPA's Risk Management Program.

The Mary Kay O'Connor Process Safety Center at Texas A&M University, founded by Michael O'Connor in memory of his wife, who died in the explosion, plays a crucial role in advancing process safety education and research.

One of the key lessons learned was about dedicated fire protection systems. Here, Trish talks about how industry standards have evolved in this area.

Trish: Yeah, the ability of having our dedicated fire systems is incredibly important, because without them, we are reliant on people being able to get close enough to something to take action, and that just may not be a possibility.

And that's something that we need to think about, because if we don't have our firefighting systems available to us, how can we possibly respond? And if we can't actually get to the particular site that we need to get to, then we've got no way of effectively controlling this.

So having things like fixed sprinkler systems to be able to knock down a vapor cloud and contain the vapor cloud within a sprinkler water fog can prevent ignition occurring. Those sorts of systems can actually prevent the explosion happening at all.

It's interesting also, just I'm going to throw in a reference to another incident here. So in 1991 in Melbourne, in the city I live in, there was actually an industrial fire at a chemical facility, and there was a frangible roof on one of the tanks, and that frangible roof flew off. And the fire main was exposed around the facility, and the fire main was actually damaged by debris from the tank roof. And so that facility lost effectively its useful firefighting because the ring main had been sheared and could no longer provide fire water.

So again, we need to have our fixed firefighting systems, but we've got to have them installed so that they're going to survive an incident and then be able to be used to respond as well.

Another aspect is making sure that your fire pumps can be operated. Do you have enough fuel to operate your fire pumps? In this particular incident in Pasadena, I think one or two of the three fire pumps had been taken out of service, and one of them ran out of fuel in an hour. So they effectively then had no way to get water around, because their electrical systems had been destroyed and their cabling had been destroyed by the initial fire as well. They couldn't resort to other pumping means either.

They certainly had the instance as well where some of their fire hydrants were sheared off at the ground level in the blast. So again, making your fire system unusable. So how are you not only installing your fixed fire protection, how are you ensuring its survivability in an incident so it's going to be there when you need it?

And I think we talk about survivability a lot in the offshore sector of the equipment to make sure it's available when we need it, but it equally applies onshore. 

Emergency Operating Procedures

Many factors contributed to the 1994 Texaco Refinery explosion, including emergency response and spill control. Listen in as Trish explains what lessons were learned from this event.

Trish: Yeah. So, this one really was around looking at their emergency operating procedures and how people were trained to respond to that, and the fact that the refinery was under a significant emergency response situation yet kept one of its units operating. So everything else was shut down, but they kept one of the units operating, which meant they were actually in a situation where they were trying to manage an emergency and also then trying to manage a steady state operation, keeping it in parameters and keeping it safe. Those were effectively two very, very different activities going on at the same time that they were not equipped to be able to deal with.

And in most instances, most people wouldn't be equipped to be able to deal with those things happening, quite frankly. When you're in an emergency situation, you actually need to focus on the emergency, and that, I think, was how they ended up potentially getting to the point where they didn't have focus on what was happening in the FCC, and that then resulted in a more significant issue occurring because they were too busy responding to the significant emergency they already had in their plant facility.

Making sure you've got really clear emergency response procedures that deal with what you're going to do, what's going to be kept running, and what's not going to be kept running, there may well be some units that you do need to keep running for various reasons. In some instances, it's more hazardous to shut something down than it is to keep it running, but you need some really good emergency and troubleshooting procedures to deal with some of those scenarios as well. We don't just need operating procedures that tell us what to do when everything's working right.

We actually also need operating procedures that tell us what to do when we are in an upset situation and we need to do some troubleshooting, because that's the unusual time, that's the time that we're not always experienced with. And that's why we draw a parallel to the airline industry. When we look at how pilots are trained and the things that they do, the moment something unusual happens, there's a checklist for it, and they go through that troubleshooting checklist to determine what action they need to take. Making sure that we clearly understand those emergency response protocols is very important.

From a spill control perspective, obviously, there was a lot of firewater to manage, and that's always an important aspect. There was not only hydrocarbon, there was firewater, there was foam, and all of those need to be managed from an environmental perspective as well.

Corrosion Control and Equipment Monitoring

Corrosion and mechanical integrity were at the center of the 2019 Philadelphia Energy Solutions Refinery fire and explosion that sent a 38,000-pound piece of shrapnel soaring through the sky. Trish offers her insight into this incident and the importance of monitoring equipment.

Trish: Yeah, so that's all around how the corrosion occurred on that pipe elbow. This is a very complex form of corrosion that can occur. The challenge in understanding this is certain types of carbon steel work really well with hydrofluoric acid, because it builds up a layer on the steel that protects it from corrosion. So, you think, "Okay, carbon steel is good for hydrofluoric acid." But if that carbon steel contains high levels of impurities of things such as copper or zinc, I think it was zinc, then the issue you've got is that coating doesn't build up, and you can get a somewhat accelerated corrosion occurring in that carbon steel.

And that particular elbow was an older piece of pipework in the refinery that was not being adequately monitored for its corrosion, because if it had been, they would've seen it was thinning faster than they anticipated. It was found that it did have a higher level of copper in it, a high level of those impurities. So basically, not monitoring the most significant parts of a plant where corrosion is likely to occur means that we don't see it happening ahead of time, when we could intervene and replace sections as needed.

And this is a classic. We often do a lot of thickness measuring on pipes, but are we doing it in the right spot? I was walking through a facility recently, and there was a whole lot of thickness measuring on the pipe, and it all looked great. Then, there was an elbow, and the elbow was insulated. It had cladding on it. My question was, "When was that tested for thickness? That elbow at that point, because of the product flow and because of the nature of your substance, is where your thinning is. It's not in the straight wall pipe that you've been doing all the testing on, but you haven't taken the cladding off. So I can only assume you haven't tested that piece of pipe." We've got to make sure we have robust processes to test in the most likely areas of corrosion to make sure that we're checking those so that we can take preventative action to replace pipe work as needed.
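To make the point about measuring in the right spot concrete, here is a minimal sketch of the usual arithmetic behind thickness monitoring: a corrosion rate from two readings at the same location, and the time left before the wall reaches a minimum required thickness. The readings, the dates and the 4.0 mm minimum are invented for illustration; they are not data from the incident or from any facility discussed here, and a real inspection program would follow the applicable piping inspection code.

```python
from dataclasses import dataclass


@dataclass
class ThicknessReading:
    """One wall-thickness measurement at a single location (hypothetical record)."""
    year: float          # when the reading was taken, in decimal years
    thickness_mm: float  # measured wall thickness in millimetres


def corrosion_rate_mm_per_year(earlier: ThicknessReading, later: ThicknessReading) -> float:
    """Average metal loss per year between two readings at the same location."""
    elapsed = later.year - earlier.year
    if elapsed <= 0:
        raise ValueError("the later reading must be taken after the earlier one")
    return (earlier.thickness_mm - later.thickness_mm) / elapsed


def remaining_life_years(latest: ThicknessReading, rate: float, minimum_mm: float) -> float:
    """Years until the wall reaches the minimum thickness, assuming a constant rate."""
    if latest.thickness_mm <= minimum_mm:
        return 0.0
    if rate <= 0:
        return float("inf")  # no measurable loss between the two readings
    return (latest.thickness_mm - minimum_mm) / rate


# Assumed example values: the same line measured on a straight run and at an elbow.
MINIMUM_MM = 4.0
straight_rate = corrosion_rate_mm_per_year(
    ThicknessReading(2015, 8.0), ThicknessReading(2023, 7.6))
elbow_rate = corrosion_rate_mm_per_year(
    ThicknessReading(2015, 8.0), ThicknessReading(2023, 5.6))

straight_life = remaining_life_years(ThicknessReading(2023, 7.6), straight_rate, MINIMUM_MM)
elbow_life = remaining_life_years(ThicknessReading(2023, 5.6), elbow_rate, MINIMUM_MM)

print(f"straight run: {straight_rate:.2f} mm/yr, about {straight_life:.0f} years left")
print(f"elbow:        {elbow_rate:.2f} mm/yr, about {elbow_life:.0f} years left")
```

With these made-up numbers the straight run looks healthy for decades while the elbow has only a few years left, which is why readings taken only on the straight pipe can give false comfort.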

Birthplace of Management of Change

The Flixborough disaster in 1974 resulted in 28 fatalities and significant damage. This landmark incident highlighted the importance of multidisciplinary collaboration and continues to serve as a lesson in process safety. Trish speaks to the impact of the lessons still being learned 50 years later.

Trish: So Flixborough is the incident that we credit with the creation of management of change, the focus whereby we do a detailed assessment of a change that we're about to make, whether it's a permanent or a temporary change. In this instance at Flixborough, it was a temporary change, and the intent was always to put Reactor 5 back. So now we have a management of change process that we use throughout the world. We still don't always get it right, but we at least have a process that we can understand and follow, where we have to identify what the change is, consult with subject matter experts, and do appropriate risk assessments to make sure that the change that we are doing is safe, it's not introducing any additional hazard, or if it is, we are managing that hazard to an appropriate level, and then also making sure if it is a temporary change, that we take it back safely again. So certainly management of change has been a significant development out of Flixborough.
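The steps Trish lists can be sketched as a simple checklist that refuses to call a change ready until each one is done. This is only an illustration under stated assumptions: the `ChangeRequest` record, its field names and the wording of the outstanding items are hypothetical, not taken from any company's procedure or any regulation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChangeRequest:
    """Hypothetical management of change record covering the steps described above."""
    description: str                      # what the change actually is
    temporary: bool                       # temporary changes need a way back
    experts_consulted: List[str] = field(default_factory=list)
    risk_assessment_done: bool = False
    new_hazards_managed: bool = False     # no added hazard, or it is controlled
    reinstatement_plan: str = ""          # how a temporary change is safely reversed


def outstanding_items(change: ChangeRequest) -> List[str]:
    """Return the steps still missing before the change should go ahead."""
    items = []
    if not change.description.strip():
        items.append("identify and describe the change")
    if not change.experts_consulted:
        items.append("consult subject matter experts")
    if not change.risk_assessment_done:
        items.append("complete a risk assessment")
    if not change.new_hazards_managed:
        items.append("confirm any added hazard is managed")
    if change.temporary and not change.reinstatement_plan.strip():
        items.append("plan how the temporary change will be taken back safely")
    return items


# Assumed example: a temporary bypass with only one discipline consulted so far.
bypass = ChangeRequest(
    description="Install a temporary bypass line around a reactor",
    temporary=True,
    experts_consulted=["process engineer"],
)
for item in outstanding_items(bypass):
    print("still to do:", item)
```

An empty list from `outstanding_items` means every step in this sketch has been ticked off; it says nothing about whether the assessment itself was done well, which still depends on the people involved.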

It also caused us to focus more on asset integrity and understanding why the leaks occurred, understanding the importance of primary containment a little bit more, a bit more focus on that. There's also competence and capability management, the fact that we do need to understand that different disciplines contribute different things to safe design and operation of a plant. So it's not just chemical engineers, or just mechanical engineers, or just electrical engineers, or even just civil engineers. We actually need all of us working together to deliver the safer outcomes for our facilities. And another really important lesson that came out of Flixborough was the placing of control rooms. Now, sadly, we've seen this happen in a number of other incidents around the world. If we think of Texas City refinery, there was the demountable trailer where people were gathered when the isomerate tower was being refilled and restarted.

So the focus now is that we try to remove people and control rooms from the middle of the process, and if we can't remove them from that, then we actually armor and strengthen those control rooms so they become a safe haven in the event of an explosion. I actually had the privilege many years ago to work in a facility that was the first major facility constructed by the company ICI at the time, following the Flixborough incident. That facility was built in the 1970s in Australia, and it had a blastproof control building.

Lessons From Boeing’s Mistakes

Trish and I often discuss lessons learned from the aviation industry. In one episode we unpack what happened with the Boeing 737 MAX crisis and what can be learned and applied to chemical facilities.

Trish: Yeah, for me, I think there are really three key lessons in this. The first one is management of change. You have to get your management of change right, you have to focus on it. You have to identify what a change is and make sure you understand all the risks associated with it.

The next one is to not be lulled into that area of focusing on profitability over everything else, especially when the work you do has the potential to kill people. You need to make sure you focus on that engineering excellence part, which historically Boeing was known for.

And the third part is identification of the weak signals. Talking about weak signals is one of my favorite topics, and the reason I talk about it here is that on the flight before the Lion Air crash, in that very plane, the MCAS actually initiated for the first time. And on that particular flight, there happened to be a third pilot hitching a ride sitting in the cockpit with the other two pilots. So, while the two pilots flying the plane were trying to figure out what was going on, the third pilot looked and saw the stabilizer dial moving and went, "Oh, the stabilizer's doing something, we need to stop that." It was a weak signal. They intervened, they were able to switch off whatever was going on and fly the plane.

Now it was repaired and was reported and it was said that this issue was there and it's all been sorted. Then the aircraft went for its next flight, and sadly there were only two pilots in the cockpit that day. There wasn't that third person who shouldn't have even been there to notice that this dial was moving because no one expected it to move so they weren't looking at that. They were looking for other things at the time. Finding those weak signals, and you wonder whether there could have been a different outcome in general had that been identified clearly as a weak signal and not just a false alarm that was triggering something at the time, before the first Lion Air crash, before the Ethiopian Air crash?

Unfortunate events happen all over the world and we will be here to discuss and learn from them. Subscribe to this free podcast so you can stay on top of best practices. You can also visit us at chemicalprocessing.com for more tools and resources aimed at helping you run efficient and safe facilities. On behalf of Trish, I’m Traci and this is Process Safety with Trish & Traci.

About the Author

Traci Purdum | Editor-in-Chief

Traci Purdum, an award-winning business journalist with extensive experience covering manufacturing and management issues, is a graduate of the Kent State University School of Journalism and Mass Communication, Kent, Ohio, and an alumnus of the Wharton Seminar for Business Journalists, Wharton School of Business, University of Pennsylvania, Philadelphia.