What Is the Best Way to Manage Transient Operations to Achieve Process Safety? 6 Steps to Follow

Oct. 21, 2024
Transient operations occur when transitioning between operating states and are often the most hazardous periods in a chemical facility.
Chemical Processing recently hosted a webinar on the importance of process safety during transient operations. Normal operations include continuous and batch processes, while abnormal situations involve plant upsets or deviations from typical parameters. Loss of control can lead to unscheduled shutdowns, and emergency situations may require immediate action. For all scenarios, proper instrumentation, risk assessment and decision-making are critical.
 
Trish Kerin, director of the IChemE Safety Centre and co-host of the Process Safety with Trish & Traci podcast series, served as the subject matter expert and emphasized that process safety revolves around leadership, focusing on knowledge, engineering, systems and human factors. 
 
Key issues in transient operations include operating outside the normal envelope, unknown parameters, unrecognized hazards and inadequate risk assessments. To address these challenges, Kerin recommends developing comprehensive procedures for steady-state operations, troubleshooting and abnormal situations. Regular procedural reviews and risk assessments are crucial to ensure safety during transient operations.
 
The ultimate goal is to bridge the gap between "work as imagined" (procedures) and "work as done" (actual practices) through field observations and procedural updates. 
 
The audience for this webinar submitted several questions and Kerin addressed them at the end of the presentation (you can view the webinar here). Here is that Q&A.
 

Q: We already have procedures for unit startup and shutdown. Do we need a procedure for transient operations? How is it different than normal startup and shutdowns? 

 
A: I would say yes. The chemical plant I talked about that I worked in, we had a startup guide, a shutdown guide, an operations guide and a troubleshooting guide. The troubleshooting guide is one that actually helps you navigate through the unexpected because when you are in the situation of potential emergency shutdown or an abnormal shutdown, that might not be covered adequately in your startup or shutdown procedures. 
 
It doesn't necessarily need to be a separate document. For example, think of it as a startup, shutdown and troubleshooting guide: a single document. In the troubleshooting section you go into scenarios: Have you checked this? What do you do when this happens? Why is this going on? Focus on this area. So, it's a different checklist, but I do think you need all three styles. That's why I think it's important to do that risk assessment and understand what your troubleshooting parameters are going to be and then address those, because the troubleshooting perspective will be slightly different from the startup or shutdown.
 

Q: Is a PHA (process hazard analysis) sufficient for risk assessment? What other studies gain industry attention?

 
A: PHAs are very good processes to use. PHA is generally a generic term around the world, and there's a range of different risk assessment techniques that are useful at different stages. They all have their place. In fact, a few years ago in the Safety Centre, we published a document called Delta Hazop, which really focuses on assessing how the system has changed. That's what the delta is about. It's not actually a structured guide-word hazop; we just called it hazop because people like the word. It's all about how the facility has changed, both from management of change and from the creeping change that occurs. You can have a facility that you've never actively changed and it has a series of pumps in it, but as the pumps wear, each pump moves off its pump curve. You've made certain assumptions that the pump operates in this way and now the pump operates differently.
 
Has that increased the risk of the facility? Has that introduced a hazard that wasn't anticipated because the system was designed and assessed based on everything operating precisely as designed? As a personal example, the tire tread on your car is a creeping change. Over time, the tire tread wears away and the car starts handling differently. And so if you are still driving in the same style, for example, in the snow on worn tires versus brand new tires, then you're going to have some different handling characteristics and some different hazards are going to be introduced. So, there are other sorts of risk assessments that you can do. The Delta Hazop is available as a free download from our Safety Centre website. It also talks about a whole series of other risk assessment methodologies for different stages of the lifecycle. We're also producing an infographic at the moment that will talk about different risk assessment methodologies to use at different lifecycle stages to focus on different things. And so that transient operations risk assessment is another example. It's a different method of risk assessment to help you understand what's going on at a different part of your facility at a different time.
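The pump-wear example of creeping change can be made concrete with a small check. The following is a minimal Python sketch, not a method from the webinar or the Delta Hazop document: it assumes a hypothetical design pump curve given as (flow, head) points, interpolates the design head at a measured flow, and flags the pump when the measured head falls short by more than an assumed tolerance. The curve values, the 5% tolerance and all function names are illustrative assumptions.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical design curve: (flow in m3/h, head in m), sorted by flow.
DESIGN_CURVE: List[Tuple[float, float]] = [
    (0.0, 60.0), (50.0, 55.0), (100.0, 45.0), (150.0, 30.0),
]

def design_head(curve: List[Tuple[float, float]], flow: float) -> float:
    """Linearly interpolate the design head at the given flow."""
    flows = [q for q, _ in curve]
    if not flows[0] <= flow <= flows[-1]:
        raise ValueError("flow is outside the design curve")
    i = bisect_left(flows, flow)
    if flows[i] == flow:
        return curve[i][1]
    (q0, h0), (q1, h1) = curve[i - 1], curve[i]
    return h0 + (h1 - h0) * (flow - q0) / (q1 - q0)

def head_shortfall(curve, flow: float, measured_head: float) -> float:
    """Fraction by which the measured head falls below the design head."""
    expected = design_head(curve, flow)
    return (expected - measured_head) / expected

def has_drifted(curve, flow: float, measured_head: float,
                tolerance: float = 0.05) -> bool:
    """True when the shortfall exceeds the assumed tolerance (5% here)."""
    return head_shortfall(curve, flow, measured_head) > tolerance

# Example: at 75 m3/h the design head is 50 m; the worn pump delivers 44 m.
print(f"{head_shortfall(DESIGN_CURVE, 75.0, 44.0):.0%}")  # prints 12%
print(has_drifted(DESIGN_CURVE, 75.0, 44.0))              # prints True
```

A flag like this does not say whether the risk has increased. It only identifies that an assumption behind the original risk assessment may no longer hold and should be revisited.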
 

Q:  When will that guidance be posted on risk-assessing transient operations?

 
A: You can go to www.icheme.org/knowledge-networks/safety-centre. In our publications area, you can download any of the publications that we develop. That's where we release all of our documentation.
 

Q: Can you categorize or classify transient operations? For example, runaway, explosions or overflow?

 
A: They are transient operations, but they are also emergency operations. I think at the start, I talked about the fact that part of the operating mode in emergency shutdown is a transient aspect. So yes, they are transient because they are constantly changing and deviating from the accepted process that has been through the detailed procedure. They are also an emergency situation, requiring emergency management activity, when you're talking about the nature of those things. So runaway, overflow, etc., I would call them transient. I'd call them emergency transient, though.
 

Q: Is there a database or statistics where the general shutdowns of refineries in the oil and petrochemical sectors are analyzed?

 
A: I'm not aware of that, no. When they've resulted in an incident of significant enough magnitude, there has been a Chemical Safety Board investigation. And so the three incidents I talked about — Texas City, Richmond Refinery and Husky Refinery — all have quite a detailed CSB investigation that walks through that detail. But I'm not aware of anybody logging any general statistics or information on shutdowns or startups. 
 

Q: Can you talk about process dynamics as applied to transient operations? For example, rate of rise?

 
A: I think working with process dynamics is important because that will start to give you what I would call weak signals, which some of you will have heard me talk about in terms of the Platypus Philosophy. Something unusual occurring, such as a rate of rise that is beyond the normal expectation, is a weak signal that something bad is potentially about to happen. So yeah, I think there is certainly a place for process dynamics in understanding that, and for advanced control systems looking at what the facility is doing, how it is changing, how it is deviating. I think as we get better and better at understanding data mining and large language models, for example, we've got a better opportunity to use some of the artificial intelligence technologies to identify these weak signals potentially earlier than we can as humans.
 
The idea here is that if we can identify, say, an abnormal rate of temperature rise before we get outside the bounds, we can intervene and take action. And so that's where I think there's enormous potential in looking at some of the artificial intelligence models to try and flag these things faster than a human can see them. We can take action, avert that incident and pull it back in. We might end up in an abnormal situation, but we might not end up in an emergency shutdown because we've been able to identify it sooner and intervene faster.
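The rate-of-rise idea can be illustrated with a minimal Python sketch. It is a plain threshold check, not an artificial intelligence model and not a method described in the webinar: it flags samples where the temperature is still inside its high limit but is rising faster than an assumed normal rate. The function name, the readings and both thresholds are hypothetical.

```python
from typing import List, Tuple

def rate_of_rise_alerts(
    samples: List[Tuple[float, float]],
    max_rate_per_min: float,
    high_limit: float,
) -> List[float]:
    """Return the times (minutes) at which the value is still below
    high_limit but has risen faster than max_rate_per_min since the
    previous sample. samples is a list of (time_min, value) pairs."""
    alerts = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        rate = (v1 - v0) / dt
        if rate > max_rate_per_min and v1 < high_limit:
            alerts.append(t1)
    return alerts

# Hypothetical reactor temperatures (minute, degrees C); high limit 170.
readings = [(0, 150.0), (1, 150.5), (2, 151.0), (3, 154.0), (4, 158.0)]
print(rate_of_rise_alerts(readings, max_rate_per_min=2.0, high_limit=170.0))
# prints [3, 4]
```

In this example the temperature never reaches the high limit, so a conventional high alarm would stay silent, yet the last two samples show a rise well above the assumed normal rate. That is the window in which intervention is still possible.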
 

Q: At more complicated facilities, how do you balance allowing the operator to do their actions to shut down and trying to find the appropriate document to shut down? 

 
A: By having the operators involved in developing the documents to start with. I think one of the challenges as plant engineers is that we sometimes think we know best because we know all the theories about how the plant should operate and how everything should be done, but we're not the actual people out there operating the plant. And so, I think obviously proximity is important. People need to be able to access the procedure that they need quite quickly. The procedure needs to be workable. And so, how do you make sure something is workable? You involve the people who have to work with it and get them involved in developing procedures, so that the procedures meet both the needs of the engineer who wrote them and the needs of the operator who has to use them. You've got to make sure they're written in the right way for the operators to understand them.
 
And that includes language and language needs. So, if you have a workforce whose first language is not English, then should your procedures potentially be written in a different language so that they can actually read them in a stressful emergency situation? If English is not your first language, you are probably going to want to go back to your native language to process information faster. So, thinking about some of those language, literacy and numeracy requirements is also critically important. And make sure that when you do involve the operators, it's through a structured risk assessment process so that you are identifying all the hazards and making sure you've addressed them at that point.
 

Q: In a plant shutdown activity, what is the best way to conduct a PSSR (pre-startup safety review) knowing that some activities may be completed in different timeframes? Is it better to conduct the PSSR after the last activity or to conduct partial PSSRs for individual completed tasks?

 
A: That's a tricky question. I'm not sure there's a one-size-fits-all answer to that, but I think a key point is making sure that whatever you do is well communicated and people understand the schedule and the plan that's been put in place. If you're doing it piece by piece, there actually needs to be one final overall review, making sure all the pieces are there, because otherwise you might be missing a critical one while everybody has assumed that someone else did it: that wasn't my bit, I did my bit, I can't be accountable for everyone else's bit. So, you need to make sure that you've got clear accountabilities and responsibilities defined and that there's an overall final check process to make sure everything has been done, because otherwise you are going to miss something. It's also important, as leaders in an organization, to sometimes get out there at the final stage. Before you start up, do one last walk around to ensure everything is okay. I was once a safety manager in a company, at a gas plant. We were coming out of a shutdown and I went for a walk just before we were about to go through the final PSSR, so we hadn't quite done the final checks, and all of a sudden I noticed a flange didn't have a gasket in it.
 
So obviously the startup was delayed somewhat as we then needed to go and do a complete check on every flange that had been broken during that shutdown to make sure that every one of them had a gasket in it. And I think we found nine or 10 flanges without gaskets. Now a gas plant will not hold pressure without gaskets in its flanges. So sometimes that extra random check of a walk around is important. That extra set of eyes, just taking another look from a different perspective, is critically important as well. So think about maybe formalizing some of that.

In another facility, I used to be one of the authorizers for confined space entries. Every time the permit was issued, even if it was a permit that was being used multiple days in a row, as the validator for that permit I would physically walk the isolation line every day to make sure that nothing had been missed. And one day I walked the isolation line, on what was the first isolation for that particular job, and discovered there was actually an isolation missing because we didn't know it existed. We'd missed it completely. And until you get out and walk that line, you don't know. So: a structured process, with clear roles and responsibilities, to make sure that everything that needs to be checked has been adequately checked.
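The point about partial reviews needing one final overall check with clear accountabilities can be sketched in a few lines of Python. This is an illustrative assumption about how such a roll-up might be recorded, not a prescribed PSSR format: each check item carries a named owner and a verified flag, and the final review lists every item that has no accountable owner or has not been verified. The class, function and item names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CheckItem:
    description: str
    owner: Optional[str]      # the person accountable for this check
    verified: bool = False    # set once the owner has confirmed it

def final_review_gaps(items: List[CheckItem]) -> List[str]:
    """Return one message per item that blocks startup."""
    gaps = []
    for item in items:
        if item.owner is None:
            gaps.append(f"{item.description}: no accountable owner")
        elif not item.verified:
            gaps.append(f"{item.description}: not verified by {item.owner}")
    return gaps

# Hypothetical partial reviews completed at different times.
items = [
    CheckItem("Flange A gasket installed", owner="Fitter 1", verified=True),
    CheckItem("Flange B gasket installed", owner=None),
    CheckItem("Isolations removed", owner="Operator 2", verified=False),
]
for gap in final_review_gaps(items):
    print(gap)
```

An empty result only shows that every recorded item has an owner and a sign-off. It cannot reveal an item nobody knew to list, which is why the physical walk around described above still matters.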
 

Q: Transient operations are stressful and sometimes chaotic times. How do you make effective decisions at this time?

 
A: It starts with making sure that you've got good guidance to help you in those situations. There are six characteristics of decision-making during a crisis that are quite practical to think about.
 
The first one is to confirm authority and direction. Who's in charge? Make sure everybody knows who is in charge. If everybody thinks they're in charge, everybody's going to be off making decisions and you're going to have more chaos. 
 
Make sure you've established psychological safety so people are safe to put their hands up and say there's something wrong without being yelled at or punished in some way because you need that information from them. 
 
You need to make sure you use not only your explicit but also your tacit knowledge. So, you're going to be relying on procedures and documents and clear, explicit knowledge, but you're also going to rely on people's experience and gut feel because they've been there before. They've seen it before. So you need both explicit and tacit knowledge coming into it. 
 
Manage people's expectations: we can't deal with that right now, we'll deal with that later, I've got to deal with this. Make sure people understand that, no, I'm not dealing with your problem because it's not a higher priority than the bigger issue I've got to deal with. If you don't manage that expectation, they will keep coming at you or they will get frustrated and go off on their own and do something, and either of those could cause a problem for you.
 
Make sure you also keep an eye on your cognitive biases and focus on understanding what they are, so you've got some checks and balances in the background to help you deal with them. I've spoken on cognitive biases in process safety before.
 
And lastly, when decisions are made, write them down. Record them, because otherwise they're forgotten or people won't realize they've been made. Make sure you document and record your decisions and communicate them adequately. Those six characteristics are critical in terms of good decision-making during a crisis situation.

About the Author

Trish Kerin, Stay Safe columnist | Director, IChemE Safety Centre

Trish Kerin is an award-winning international expert and keynote speaker in process safety and the inaugural director of the IChemE Safety Centre. Trish leverages her years of engineering and varied leadership experience to help organizations improve their process safety outcomes. 

She has represented industry to many government bodies and has sat on the board of the Australian National Offshore Petroleum Safety and Environmental Management Authority. She is a Chartered Engineer, registered Professional Process Safety Engineer, Fellow of IChemE and Engineers Australia. Trish also holds a diploma in OHS, a master of leadership and is a graduate of the Australian Institute of Company Directors. Her recent book "The Platypus Philosophy" helps operators identify weak signals. 

Her expertise has been recognized with the John A Brodie Medal (2015), the Trevor Kletz Merit Award (2018) and Women in Safety Network’s Inaugural Leader of the Year (2022), and she has been named a Superstar of STEM for 2023-2024 by Science and Technology Australia.
