
Let's Be Frank: How Do We Measure Up?

By Inspector Frank. June 30, 2021

Editor’s Note: Writing under the pseudonym Inspector Frank, the author of this column offers a first-hand, candid view of what he has witnessed throughout his career. His purpose in sharing these experiences and opinions is to encourage readers to think deeper about what they do, why they do it, and the possible impact of their decisions.

Inspectioneering is committed to protecting the anonymity of pseudonymous authors. We do, however, hold these contributors to the same editorial standards as those writing under their own names. To that end, we know the author’s identity and maintain communications regarding the author’s published works. If you have any questions, feedback, or concerns stemming from this article, please send an email to befrank@inspectioneering.com and we will forward your correspondence to the appropriate party.

Introduction

We live in a world of metrics and Key Performance Indicators (KPIs). This isn’t a bad thing; in fact, it is vitally important. Good Process Safety Management (PSM) requires accurate and complete metrics to ensure that systems are working the way they should. But you have to be careful with the KPIs and metrics you select as a company, because, as I have found over the years, people will make sure they hit whatever targets you set.

What do I mean by that?

People will typically perform to what they are measured against. So, when setting up metrics, you had better make sure they aren’t simply the “easy to measure” ones, but rather the ones that actually measure for the effect you want. The easiest program metrics to set up and show people are straight quantitative measurements: for example, we had 20 tasks to do this month and completed all 20, as opposed to only 10 of 20 last month. Quantitative metrics like this are fine as far as they go, but they say nothing about quality or effectiveness. Maybe the 10 tasks done last month were much more effective at producing a desirable outcome.

As an aside, KPIs are generally defined as measurable values that show how effective you are at achieving business objectives, while metrics simply track the status of a specific business process. In short, KPIs track whether you hit business objectives/targets, and metrics track processes. In a lot of the places I have worked, this distinction doesn’t seem to be well understood, and the two terms are used pretty much interchangeably.

In this article, I’m going to discuss a couple of examples of how the simple act of holding people accountable to recorded metrics can have some unwanted side effects. For ease, let’s use a safety example.

Example #1

Every industrial facility I have ever worked at uses Total Recordable Injury Frequency (TRIF) as a lagging indicator of safety performance. A lagging indicator is something that indicates performance after it has occurred, as opposed to leading indicators, which can be measured to predict potential outcomes, whether they are favorable or not.

So, for measuring safety performance, a leading indicator might be “are all onsite staff members up to date with the required safety training?” If safety training compliance starts slipping, that could be a leading indicator that the safety culture of the organization is starting to slip and therefore more injuries could be the outcome.
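To make the distinction concrete, here is a minimal sketch of the two indicator types with made-up numbers. It assumes the common convention of normalizing TRIF per 200,000 exposure hours (roughly 100 full-time workers for a year); the exact basis varies by company and jurisdiction.

```python
# A minimal sketch of a lagging and a leading safety indicator.
# The 200,000-hour normalization is a common convention, not universal.

def trif(recordable_injuries: int, hours_worked: float) -> float:
    """Lagging indicator: recordable injuries per 200,000 exposure hours."""
    return recordable_injuries * 200_000 / hours_worked

def training_compliance(staff_current: int, staff_total: int) -> float:
    """Leading indicator: fraction of onsite staff up to date on required training."""
    return staff_current / staff_total

# Example: 6 recordables over 1.2 million hours worked; 470 of 500 staff
# current on required safety training. All figures are illustrative.
print(f"TRIF: {trif(6, 1_200_000):.2f}")                            # 1.00
print(f"Training compliance: {training_compliance(470, 500):.0%}")  # 94%
```

The lagging number tells you what already happened; the leading number gives you something you can act on before the injuries show up.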

But back to safety metrics.

In this first case, the TRIF (a lagging indicator) at a processing facility started jumping up unexpectedly. Over a one-year period, injury rates went higher than had ever been seen before. Some of this could be attributed to a change in how injuries and incidents were reported, but injuries of workers at the facility did genuinely increase.

This obviously concerned upper management, who wanted to lower the number of injuries. In and of itself, the noble pursuit of reducing injuries should have numerous beneficial effects. However, the way it was approached in this case had some interesting consequences.

The decision was made to get all management, regardless of role, out into the plant more doing safety audits and observations. The final version of this was a “no names, no punishment” style of safety observation/audit that I am sure most of you have seen. It became everyone’s goal in management that year to complete at least one field audit per month. In fact, Superintendents were openly praised in meetings when their personnel or department exceeded their targets.

That drove department managers to push their management personnel; if the minimum target was one per month and people were watching, they wanted their area to shine. Some workgroups started pushing for two safety observations per month, or in some cases one observation per week (or more).

Sounds reasonable, right?

Unfortunately, this approach to reducing injuries not only failed to achieve the results management wanted but had some unintended consequences. Listed below are a couple of things that went wrong.

  1. For starters, now that the plant had engineers, integrity and reliability specialists, maintenance shop supervisors, etc. all out in the plant performing safety audits, less time was being spent on their own core tasks and duties. Because these metrics could impact everyone’s formal reviews and financial rewards, they were often prioritized at the cost of other core responsibilities.
  2. As time went on, individuals started “gaming the system.” Everyone knew they were now being asked to do safety observations but did not always have the time to do them. This led to many safety observations being done in the quickest, easiest way possible. So instead of going out into the plant and watching a job with a high potential for personal injury (e.g., swapping out a high-pressure check valve at elevation using cranes), an engineer would do a safety observation on the janitor down the hall from his office or on another engineer working at a computer. Doing things like this allowed someone to get a safety observation done in 15 minutes, rather than spending a whole morning or day on it. This obviously wasn’t the intent.

While these metrics were intended to improve the safety culture at the facility, they ended up being nothing more than a measurement of how many observations were done, not of how effective those observations were. As a result, the TRIF did not go down under this program. It did, however, begin to decline when changes were implemented in how work was planned, laid out, and executed in the plant. Once senior management realized this, the whole safety observation program was dropped.

The question, then, is whether this program and the associated metrics were worth the effort. As it was done, no. This was a very demoralizing outcome: the leadership team had spent time setting the program up, and the required training and the systems built to measure the work were all for naught. It was obviously very frustrating for everyone at the plant, and many personnel felt they had spent a lot of time on what they saw, in the end, as wasted effort.

It didn’t need to be this way, though. The safety observations could have been more effective if they had been targeted at work known to have high injury potential. These jobs could then have been assigned to the appropriate management groups as targets for their observations. For example, let’s say the plant’s highest injury rate comes from hands caught in pinch points. Work with a higher risk of creating pinch points, such as valve replacements or compressor work, would be prioritized for safety observations. A workgroup could then be assigned to perform observations on a designated percentage of all valve replacement jobs and be measured against whether it was doing those observations or not.

Instead, it became a straight quantitative numbers game of how many observations got done.

It would have taken more effort to set up a targeted safety observation program and then measure performance against that plan, but the results would likely have been much more valuable to the overarching goal of reducing the number of onsite injuries.
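To make that concrete, here is a minimal sketch of measuring coverage against such a plan, assuming the high-risk job types and target observation percentages have been identified up front. The job types, targets, and monthly figures are hypothetical, for illustration only.

```python
# A rough sketch of measuring a *targeted* observation program: coverage
# of known high-risk job types against a plan, rather than a raw count
# of observations. All job types, targets, and figures are hypothetical.

HIGH_RISK_TARGETS = {
    "valve_replacement": 0.50,  # observe at least 50% of these jobs
    "compressor_work": 0.25,    # observe at least 25% of these jobs
}

def observation_coverage(jobs_done: dict, jobs_observed: dict) -> dict:
    """Compare observed high-risk jobs against the targeted coverage plan."""
    report = {}
    for job_type, target in HIGH_RISK_TARGETS.items():
        done = jobs_done.get(job_type, 0)
        observed = jobs_observed.get(job_type, 0)
        actual = observed / done if done else 0.0
        report[job_type] = {"target": target, "actual": round(actual, 2),
                            "met": actual >= target}
    return report

# Example month: 12 valve replacements (7 observed), 4 compressor jobs (1 observed).
for job_type, result in observation_coverage(
        {"valve_replacement": 12, "compressor_work": 4},
        {"valve_replacement": 7, "compressor_work": 1}).items():
    print(job_type, result)
```

Measured this way, the question shifts from “how many observations were done?” to “were the high-risk jobs actually covered?”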

Example #2

Let’s take a look at another case where metrics had an outcome that was not anticipated. It started with a question from senior management: “Is the amount of money spent on the piping corrosion monitoring program worth it?” This was quickly followed by a directive to reduce unnecessary Corrosion Monitoring Locations (CMLs) without putting the facility at greater risk of a loss of containment.

The Superintendent in charge of equipment integrity decided to set up metrics to measure each zone inspector’s ability to reduce unneeded CMLs in their assigned plants over the coming year. Seems pretty straightforward, right? But what is an unnecessary corrosion monitoring location? In this particular case, it was never clearly defined.

Nevertheless, a deadline was set, and during the kickoff meeting the Superintendent said that this cost-saving effort was much needed and that those who did well would see it reflected in their annual bonuses. A 20% reduction in total CMLs was declared to be everyone’s goal, and the timeliness of the review was deemed critical. The percentage was based on how much money senior management had decided needed to be shaved off the cost of monitoring piping.

The zone inspectors went back to work with this additional task; none of their existing workload was reduced, nor were any other initiatives put on hold to provide time to do the review properly. Every month, the zone inspectors had to report the percentage of piping circuits reviewed and the total percentage of CMLs reduced. In their simplest form, the metrics for each plant looked like this:

Plant XX piping circuits reviewed:   20 of 300              (7% complete)
CML reduction to date:               80 of 6,000 in plant   (1.3% reduction)


These could then be summarized and provided as a straight quantitative measurement to senior management by the Superintendent.
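For illustration, here is a minimal sketch of that rollup, reproducing the example figures above. Notice that nothing in it captures which CMLs were removed or why; that is exactly the gap discussed below.

```python
# A minimal sketch of the purely quantitative rollup described above.
# It reproduces the counts and percentages from the example table, and
# nothing else: no field records *which* CMLs were removed, or why.

def plant_rollup(circuits_reviewed: int, circuits_total: int,
                 cmls_removed: int, cmls_total: int) -> str:
    pct_reviewed = circuits_reviewed / circuits_total
    pct_reduced = cmls_removed / cmls_total
    return (f"circuits reviewed: {circuits_reviewed} of {circuits_total} "
            f"({pct_reviewed:.0%} complete); "
            f"CML reduction: {cmls_removed} of {cmls_total} "
            f"({pct_reduced:.1%})")

# The "Plant XX" figures from the table above:
print(plant_rollup(20, 300, 80, 6000))
# circuits reviewed: 20 of 300 (7% complete); CML reduction: 80 of 6000 (1.3%)
```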

At the monthly department meetings, those who were moving quickly through the reviews and showing a high percentage of removed CMLs were praised. Messages came down from senior management that everyone was doing a great job.

What was worrisome about this was that it was unclear whether the right CMLs were being deactivated. Each of the zone inspectors was using good technical judgment, but as the year progressed there was more pressure to finish this exercise in cost savings. Shortcuts started getting taken to hit the one-year deadline for the arbitrarily chosen 20% reduction target.

Were there better ways to remove ineffective CMLs and reduce costs? For sure, but they would have taken more time to set up and probably more time to review and analyze the circuits. In the end, the net outcome was a 20% reduction in CMLs for the facility, but with no way of knowing exactly what drove the removals or what impact on process safety they may have had.

“The more a quantitative metric is visible and used to make important decisions, the more it will be gamed—which will distort and corrupt the exact processes it was meant to monitor.”
— An adaptation of Campbell’s Law

Conclusion

In conclusion, I would like to provide a quick summary of what I have learned from poorly applied metrics over the course of my career:

  1. Metrics are a critical management tool. Like any management tool, however, they need to be thoroughly planned up front, and their potential effects need to be well thought out. Metrics and the systems generating them need to be analyzed and reviewed in detail throughout their life to ensure they are not producing undesired outcomes. This takes significant time and effort on the part of the managers overseeing the work.
  2. Don’t let metrics control you or your team.
  3. Metrics should never be a substitute for logical thinking or sound judgment, nor should they be used as an excuse.
  4. The more quantitative a metric is and the more tightly it is tied to performance rewards, the easier it is for people to “game the system.” In my experience, this usually corrupts the very processes the metrics were meant to measure.
  5. If the metrics being measured aren’t producing the desired outcome, don’t be scared to gut the system in order to make it effective.

“You must have the confidence to override people with more credentials than you whose cognition is impaired by incentive-caused bias or some similar psychological force that is obviously present. But there are also cases where you have to recognize that you have no wisdom to add—and that your best course is to trust some expert.”

— Charlie Munger (Vice Chairman of Berkshire Hathaway)
