Managers at process plants continually ask, "How many alarms can an operator handle?" Because dealing with alarms is a large part of a console operator's activities, this is an important question. A research study covering 37 operator consoles conducted by the Abnormal Situation Management consortium showed a monthly average alarm rate of 2.3 per ten-minute period for normal operations and a median value of 1.77. A study by the U.K. government's Health & Safety Executive reported five alarms per ten-minute period; for 95% of the consoles included in that study, the average monthly peak alarm rate was 31–50 alarms per ten minutes. Guidelines published in 1999 by the Engineering Equipment and Materials Users' Association (EEMUA) stated that less than one alarm per ten minutes should be the goal, that two per ten minutes is manageable, and that five in ten minutes likely will pose excessive demands on operators. The guidelines describe fewer than ten alarms in the first ten minutes following an upset as manageable and more than 20 as "hard to cope with." The International Society of Automation (ISA) largely adopted similar values in the ISA 18.2 standard, with one to two alarms per ten minutes in steady state and fewer than ten in any ten-minute period.
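To make the quoted bands easier to compare against the survey figures, here is a minimal Python sketch. The thresholds are the EEMUA values as paraphrased in this article, not taken from the EEMUA publication itself. The function names and the "between quoted bands" label are hypothetical and exist only for illustration. Rates that fall between the quoted values are deliberately left unclassified, because the article does not characterize them.

```python
# Illustrative only: bands follow the EEMUA figures as quoted in this article.
# Function names and the "between quoted bands" label are assumptions.

def steady_state_band(alarms_per_ten_min: float) -> str:
    """Classify a steady-state alarm rate (alarms per ten minutes)."""
    if alarms_per_ten_min < 1:
        return "goal"
    if alarms_per_ten_min <= 2:
        return "manageable"
    if alarms_per_ten_min >= 5:
        return "likely excessive"
    return "between quoted bands"  # the article gives no label for 2 to 5


def upset_band(alarms_in_first_ten_min: int) -> str:
    """Classify the alarm count in the first ten minutes after an upset."""
    if alarms_in_first_ten_min < 10:
        return "manageable"
    if alarms_in_first_ten_min > 20:
        return "hard to cope with"
    return "between quoted bands"  # the article gives no label for 10 to 20


# Survey figures cited in the article, all in alarms per ten minutes.
for label, rate in [("ASM mean", 2.3), ("ASM median", 1.77), ("HSE", 5.0)]:
    print(f"{label}: {rate} -> {steady_state_band(rate)}")
```

Under these assumptions, the ASM median lands in the manageable band, while the HSE figure sits at the level the guidelines consider likely to be excessive.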
Plants generally accept the EEMUA/ISA values, but are they correct? This was the question the Center for Operator Performance (COP), a Dayton, Ohio, consortium that includes operating companies and automation vendors (www.operatorperformance.org), asked itself. To answer it, the center commissioned Louisiana State University (LSU) to conduct a series of studies over the past two years.
TWO STUDIES
The first study involved LSU engineering and construction management students and used five alarm rates on a pipeline simulator. The alarms were presented either in order of actuation time with priority color-coded, or grouped by priority. At one, two, five and ten alarms in ten minutes, there was no difference in the response time per alarm. At 20 alarms in ten minutes, response time increased by a statistically significant amount, and the display with alarms grouped by priority yielded a statistically significantly better response time. So, somewhere between ten and 20 alarms in ten minutes, response time degraded, but not as much when alarms were grouped by priority. At 20 alarms in ten minutes, the students achieved better response times for higher priority alarms by sacrificing time on low priority ones.
The limitations of this study are obvious. It used students. The alarm rates were for only ten-minute periods of time. The alarms were evenly distributed across priorities, so there were as many high priority alarms as medium or low priority ones. However, the implications were so significant they prompted a second phase to address some of these limitations.
This involved exposing actual refinery operators and pipeline controllers to alarm rates for 60 minutes. Because the previous study didn't see any effect at rates up to ten alarms in ten minutes, higher rates were used: 15, 20, 25 and 30 alarms per ten minutes. The alarms were distributed as suggested by EEMUA and ISA: 5% high, 15% medium and 80% low priority.
The results contained a number of surprises. The professionals averaged from 19 seconds per alarm at the lowest rate to almost 26 seconds per alarm at the highest. At 30 alarms per ten minutes, a queue of alarms began to develop because the operators had to spend more time assessing new alarms. They kept their response times to high and medium priority alarms about the same by spending less time on lower priority ones.
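A back-of-the-envelope calculation shows why a queue is plausible at the highest rate. The Python sketch below assumes alarms are handled strictly one at a time and that the reported average seconds per alarm covers the operator's full handling time. Both are simplifying assumptions, not findings of the study. It uses only the two endpoints reported above, rounds "almost 26 seconds" up to 26, and the function name is hypothetical.

```python
# Rough check of alarm-handling time against a ten-minute window.
# Assumptions: serial handling, and the average seconds-per-alarm figure
# represents the total time an operator spends on each alarm.

WINDOW_SECONDS = 10 * 60


def busy_fraction(alarms_per_window: int, seconds_per_alarm: float) -> float:
    """Fraction of the window spent on alarms; above 1.0 implies a backlog."""
    return alarms_per_window * seconds_per_alarm / WINDOW_SECONDS


# Only the two endpoints reported in the study are used here.
lowest = busy_fraction(15, 19)    # 285 s of handling in a 600 s window
highest = busy_fraction(30, 26)   # 780 s of handling in a 600 s window

print(f"15 alarms at 19 s each: {lowest:.1%} of the window")
print(f"30 alarms at 26 s each: {highest:.1%} of the window")

# Break-even handling time at 30 alarms per ten minutes.
print(f"Break-even at 30 alarms: {WINDOW_SECONDS / 30:.0f} s per alarm")
```

Under these assumptions, any average handling time above 20 seconds per alarm exceeds the ten-minute window at 30 alarms, which is consistent with the backlog the operators experienced.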
It may seem obvious that fewer alarms presented to an operator will result in quicker resolution of abnormal situations. However, you must consider the operator's whole workload in any assessment. After all, operators do more than just respond to alarms. Any published targets for alarm rates should specify what's assumed to be the workload for the other tasks; alarm rates in isolation have limited meaning.
However, setting the other tasks (monitoring, control, administrative, communication, etc.) to zero establishes the upper limit of console workload based upon alarm response only. This research is doing that. So for an upset, when dealing with alarms is likely the major component of workload, 30 alarms per ten minutes will result in low priority alarms backing up. Of course, making simple changes to the way alarms are displayed also impacts the operator's reaction time. Even within the alarm workload variable, you can make improvements that reduce overall operator workload.
The second study failed to yield the degree of improvement seen in the first one when the display grouped alarms by priority (category). While 80% of the professional operators and controllers preferred such grouping, the performance benefit from it wasn't as great as for the students. Perhaps this is because they are professionals. However, the two studies had different alarm priority distributions — low, medium and high evenly distributed in the first and the ISA/EEMUA distribution in the second. The professional operators faced fewer of the highest priority alarms at an alarm rate of 30 alarms in ten minutes than students saw at an alarm rate of ten alarms in ten minutes. So the effect of the categorical alarm display may not appear unless there are high rates of high priority alarms.
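The comparison of high priority alarm exposure can be checked with simple arithmetic. The Python sketch below assumes the first study's even split across three priorities means one third of alarms were high priority, and it uses the 5% high priority share from the second study. The function name is hypothetical.

```python
from fractions import Fraction


def high_priority_per_window(alarms_per_window: int, high_share: Fraction) -> Fraction:
    """Expected number of high priority alarms in one ten-minute window."""
    return alarms_per_window * high_share


# Students: ten alarms per ten minutes, evenly split across three priorities.
students = high_priority_per_window(10, Fraction(1, 3))

# Professionals: 30 alarms per ten minutes, 5% high priority.
professionals = high_priority_per_window(30, Fraction(5, 100))

print(f"Students at 10 per ten minutes: {float(students):.2f} high priority alarms")
print(f"Professionals at 30 per ten minutes: {float(professionals):.2f} high priority alarms")
```

On these figures, the students saw roughly 3.3 high priority alarms per ten minutes at their lower rate, while the professionals saw 1.5 at their highest rate.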
THE VALUE OF EXPERIENCE
The second study yielded something extra: the professionals repeated part of the first study, handling ten and 20 alarms in ten minutes with the same simulation used by the students. The two groups performed about the same at the ten-alarms-per-ten-minutes rate. However, the professionals were about twice as fast as the students at the higher rate.
So what is the value of a trained operator? When it comes to alarms, an experienced person can perform twice as fast as a novice during upset conditions. However, that value/ability won't come out when things are calm — the veterans and rookies will look about equal.
This research raises two questions:
1. Are the ISA and EEMUA targets correct? From the research conducted by COP, you can view the ISA and EEMUA numbers as conservative. If a plant achieves them, operators should be able to manage the alarm workload.
2. Could an operator handle more alarms? The COP studies indicate that experienced operators probably could deal with an increased number of alarms; however, the studies don't consider all factors that affect an operator's workload. The research also seems to show there's a breakpoint beyond which operators can't handle the alarm workload safely. The actual breakpoint likely depends upon several variables, including the rate of alarms, how alarms are displayed to operators, and additional operator workload.
COP's initial research begins to provide a scientific basis for establishing realistic alarm rates for operators. The center also will investigate other variables that can affect how operators handle alarms. Next on the agenda is to explore how alarm presentation impacts performance.
Stay tuned!
DAVID A. STROBHAR, PE, is principal human factors engineer for Beville Engineering, Inc., Dayton, Ohio, and is heavily involved in the activities of the Center for Operator Performance. CRAIG M. HARVEY, Ph.D., PE, is an associate professor and interim chair of the construction management and industrial engineering dept. of Louisiana State University, Baton Rouge, La.