Achieve Comprehensive Reliability by Combining RBI and RCM

By Walt Sanford, President at Pinnacle. This article appears in the March/April 2015 issue of Inspectioneering Journal.

Introduction

Managers of refineries, chemical plants, and other industrial facilities around the world constantly seek to develop and implement effective, efficient, and reliable mechanical integrity and Risk-Based Inspection (RBI) programs. These programs are crucial to meeting regulatory requirements and improving asset integrity. Today, many managers are finding that they can also address the reliability of all types of assets by combining RBI and Reliability Centered Maintenance (RCM) into one comprehensive reliability management process.

A Common Foundation

Managers of asset-intensive processes and systems are tasked with designing, installing, operating, and maintaining equipment. As such, they often utilize a myriad of tools and technologies to help make decisions on how best to meet the asset owner’s investment requirements while maintaining a safe and regulatory compliant operation. RBI is commonly used to help assure the integrity of operating assets so the assets can reliably and safely meet lifecycle objectives. Other processes include RCM, Process Hazards Analysis (PHA), Reliability, Availability and Maintainability (RAM) analysis, and Root Cause Analysis (RCA).

Fundamentally, all of these processes have risk identification and mitigation at their core. Yet, more often than not, they are conducted as separate efforts, are based on slightly different risk criteria, and require significant input and support from the same members of the organization. These factors can lead to significant inefficiencies in resource utilization, as well as unclear work priorities, redundant activities, and organizations operating in silos with poorly aligned goals and objectives.

The most efficient and effective organizations are those that recognize the common fundamentals of the various tools and processes, the necessary specialization within their application, and the efficiencies gained from their integration into a common, comprehensive reliability management program. The challenge is to focus on the fundamental aspects of typical risk-management tools and how they can be harmonized within a common risk approach to achieve effective and efficient life-cycle reliability and integrity.

Risk Based Inspection

The primary mission of each facility’s inspection department is to monitor the condition of facility assets and ensure that each asset is suitable for its intended service. Historically, the type or technique of inspection, the frequency of inspection, and the extent or location of inspection have been driven by prescriptive codes such as API 510, the National Board Inspection Code (NBIC), or other industry or jurisdictional documents. These codes and standards did not allow for much variation in inspection programs.

In 1992, the Process Safety Management (PSM) regulation (29 CFR Section 1910.119(j)) established requirements for managing the mechanical integrity of equipment involved in processing highly hazardous chemicals. With the establishment of RBI and its associated industry codes such as API 580 and NBIC RB 9300, inspection groups can develop inspection programs that make the most effective use of their inspection resources while managing the risk associated with equipment operation. As opposed to prescriptive approaches, a facility utilizing RBI applies the most effective techniques for each equipment item and degradation mechanism, determines the appropriate monitoring intervals, and identifies the best possible locations to test for degradation and analyze equipment condition. Overall, an effective RBI program provides the facility with better information regarding the condition of its equipment and more accurate expectations for its remaining life.
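As a rough illustration of how risk can drive inspection priorities, the sketch below ranks a few equipment items by risk, taken here as probability of failure multiplied by consequence of failure. The equipment tags, probabilities, and dollar figures are hypothetical, and the names `Equipment` and `rank_by_risk` are assumptions made for this example rather than part of any RBI standard or software.

```python
from dataclasses import dataclass


@dataclass
class Equipment:
    tag: str
    pof: float  # hypothetical probability of failure per year
    cof: float  # hypothetical consequence of failure, in dollars

    @property
    def risk(self) -> float:
        # Risk expressed as expected loss per year.
        return self.pof * self.cof


def rank_by_risk(items):
    """Return equipment sorted from highest to lowest risk."""
    return sorted(items, key=lambda e: e.risk, reverse=True)


# Illustrative values only; not taken from the article or any code.
fleet = [
    Equipment("V-101", pof=0.001, cof=5_000_000),
    Equipment("E-205", pof=0.020, cof=100_000),
    Equipment("P-310", pof=0.050, cof=400_000),
]

for item in rank_by_risk(fleet):
    print(f"{item.tag}: {item.risk:,.0f} $/yr")
```

In this made-up fleet the pump P-310 ranks first even though the vessel V-101 has the largest consequence, which is the kind of trade-off a prescriptive, fixed-interval program would not surface.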

Reliability Centered Maintenance

Much like RBI, RCM is a methodical and logical approach for the creation of a proactive maintenance program. RCM applies consistent, risk-based decision-making processes that focus proactive activities on the mitigation of intolerable risk, and on the cost-effective application of tasks to less critical equipment where justified.

RCM was created in the late 1960s during the advent of jumbo jets (Boeing 747, DC-10, etc.). Absent any changes to the maintenance practices of that time, the operation of these large aircraft would have resulted in increased risk as the number of passengers per aircraft increased. When the airlines and aircraft manufacturers analyzed the maintenance requirements with the intent to mitigate the increased risk, they concluded that they would likely not be able to operate the planes economically under the existing maintenance philosophy.

Therefore, to reduce the risk of having a catastrophic failure, operators had to reduce the probability of critical failures given the new consequence scenario, while eliminating the time and cost of tasks that did not serve the purpose of significantly mitigating intolerable risks. This required a whole new approach to the way maintenance requirements were defined. The result was a methodical analysis of the equipment to determine a proactive maintenance program. This new approach was designed to realize the inherent reliability of system functionality through fully justified tasks that are either necessary or desirable to protect safety and the operating capability of equipment critical to achieving that functionality.

Since then, the RCM process has evolved and currently exists in various forms that have been applied across many asset-intensive industries. The most significant benefits have been realized within organizations that have implemented RCM in an efficient, comprehensive, and technically disciplined manner consistent with the fundamental tenets of RCM’s origin.

Process Hazards Analysis

Following the establishment of the Occupational Safety and Health Administration (OSHA) in 1970 and subsequent to a series of major accidents, OSHA and the Environmental Protection Agency (EPA) established regulations for industries involving the handling of hazardous material. Today, PHA is a key requirement of the EPA’s Risk Management Program (RMP) rule, 40 CFR Part 68, and OSHA’s PSM standard, 29 CFR 1910.119. These regulations require that PHA address toxic, fire, and explosion hazards resulting from specific chemicals and their possible impacts on employees, the public, and the environment.

PHA is a thorough, orderly, and systematic approach for identifying, evaluating, and controlling the hazards of processes involving highly hazardous chemicals. The regulations require that facility operators perform a PHA on all processes covered by the EPA’s RMP rule or OSHA’s PSM standard, and that the selected PHA methodology be appropriate to the complexity of the process and identify, evaluate, and control the hazards involved in the process. PSM also requires that the PHA be updated and revalidated at least every five years after the completion of the initial PHA, and that one or more of the following methods be applied to achieve the goals of a PHA:

  • What-if
  • Checklist
  • What-if/checklist
  • Hazard and Operability Study (HAZOP)
  • Failure Mode and Effects Analysis (FMEA)
  • Fault tree analysis
  • An appropriate equivalent methodology

The HAZOP and FMEA methods are generally recognized as the most comprehensive approaches to meeting PHA requirements, and HAZOP is, by far, the most used PHA method. In fact, many facilities that do not fall under the PSM rule, either due to process and material specifics or because they operate outside of OSHA jurisdiction, still utilize a PHA (typically a HAZOP) to identify, evaluate, and control the hazards of processes involving significant hazards. These requirements are usually driven by regional regulations or internal corporate standards.

Reliability, Availability and Maintainability (RAM) Analysis

Monte Carlo methods are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The modern version of the Monte Carlo method was invented in the late 1940s during work on nuclear weapons projects at the Los Alamos National Laboratory. Monte Carlo, or “stochastic,” simulation is at the heart of RAM analysis. RAM modeling was initially used by the US military in the 1960s and 1970s and was picked up by industry from the late 1970s through the 1980s. With the advancement of computer processor speed in the 1990s, RAM modeling migrated out of the research labs where the big computers were housed and onto the desktops of commercial industry engineers.

RAM models are used to simulate the range of probable future performance of a given process design. The output is used to quantify the economics or other performance criteria of equipment-related decisions such as redundancy, spare parts, equipment sizing, maintenance practices and policies, and component quality, among others. For new designs, RAM is a powerful tool for evaluating design decisions affecting issues such as:

  • Probability of unplanned events and impacts on lifecycle performance
  • Process buffer sizing and location
  • Unit/equipment redundancy and sizing
  • Process technology
  • Utility requirements
  • Capital or insurance spares requirements for major equipment

Throughout the life-cycle of existing processes and units, RAM can be an invaluable tool for assisting in decisions related to:

  • Maintenance philosophy, scope and timing
  • Obsolescence and end of useful life (repair or replace/upgrade)
  • Impact of actual failures on risk exposure and priorities of repairs
  • Impact of design or process changes to maintenance, operations and control strategies
  • Spare-parts stocking strategy optimization based on actual parts usage and criticality
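
To make the Monte Carlo idea behind RAM analysis concrete, the following is a minimal sketch that estimates the availability of a single repairable item by repeatedly sampling random failure and repair times. It assumes exponentially distributed times to failure and to repair, and the function name `simulate_availability` and all input values are hypothetical. Real RAM models cover whole process configurations and are far more detailed.

```python
import random


def simulate_availability(mtbf_h, mttr_h, horizon_h, runs, seed=0):
    """Estimate availability of one repairable item by Monte Carlo.

    Assumes exponential time-to-failure (mean mtbf_h) and exponential
    time-to-repair (mean mttr_h); both are simplifying assumptions.
    """
    rng = random.Random(seed)
    total_up = 0.0
    for _ in range(runs):
        t = 0.0   # elapsed time in this simulated life
        up = 0.0  # accumulated running time
        while t < horizon_h:
            # Sample the next running period, truncated at the horizon.
            run_time = min(rng.expovariate(1.0 / mtbf_h), horizon_h - t)
            up += run_time
            t += run_time
            if t >= horizon_h:
                break
            # Sample the repair duration that follows the failure.
            t += min(rng.expovariate(1.0 / mttr_h), horizon_h - t)
        total_up += up
    return total_up / (runs * horizon_h)


# Illustrative inputs: 1,000 h mean time between failures, 50 h mean
# time to repair, ten years of operation, 200 simulated lives.
estimate = simulate_availability(1000.0, 50.0, 87_600.0, 200, seed=1)
print(f"Estimated availability: {estimate:.3f}")
```

For this simple case the estimate should land close to the steady-state value MTBF / (MTBF + MTTR), about 0.952. Simulation earns its keep when redundancy, buffers, or spares policies make a closed-form answer impractical.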

Root Cause Analysis

The Root Cause Analysis (RCA) method was devised in the 1950s as a formal study following the introduction of Kepner-Tregoe Analysis. RCA was further developed by the National Aeronautics and Space Administration (NASA) due to limitations inherent in previous methods when trying to solve complex problems. RCA is typically used as a reactive method of identifying event causes, and is typically conducted after an event has occurred. It attempts to solve problems by identifying and correcting the root causes of events, rather than simply addressing their symptoms, thus preventing problem recurrence. Proper RCA recognizes that complete prevention of recurrence by one corrective action is not always possible, and several effective measures might be necessary to address a single root cause. Thus, RCA is an iterative process and a key tool for continuous improvement.

Common Basis and Objectives

Most companies utilize one or more of the above tools or technologies to manage the risks associated with operating their facilities. Unfortunately, they are most often implemented and managed through separate efforts and internal organizations, which can lead to significant resource inefficiency and misalignment of decisions.

When analyzing the above processes, operators will recognize that a fundamental basis and process are common to each. When properly and effectively employed, each of these methods should:

  • Align with business goals and objectives
  • Achieve regulatory compliance, and actual safety and environmental responsibility
  • Define the true performance objectives of each plant, unit, process, system, and equipment item in achieving the above
  • Identify the hazards associated with meeting the performance objectives
  • Determine the risks associated with the hazards (equipment, process, human, environment, etc.)
  • Determine the most efficient and effective ways to mitigate intolerable risks
  • Validate, implement, and execute the mitigation tasks
  • Document the entire process in a way that facilitates the continuous assessment of performance, and the continuous improvement of the process throughout the asset life-cycle

When viewed like this, it is clear that each method is fit-for-purpose for specific asset classes or objectives, but that all share common fundamental approaches. Table 1 depicts an assessment of the degree of commonality of the fundamental steps of each method:

Table 1.

To overcome the inefficiencies, misalignment, and redundancies inherent in managing these efforts separately, an organization should try to harmonize the common elements of each, focus on the fit-for-purpose elements that need to differ, document them in a common framework and system of record, and assess their effectiveness using a common and aligned set of key performance indicators.
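
One way to picture a common framework and system of record is a single register in which findings from the different studies are scored against the same risk matrix and grouped by asset. The sketch below is only an illustration under assumed conventions: the 3x3 scoring scale, the category thresholds, the field names, and the sample findings are all hypothetical.

```python
from collections import defaultdict


def risk_category(likelihood, consequence):
    """Score a finding on a hypothetical shared 3x3 matrix.

    Both inputs run from 1 (low) to 3 (high); the thresholds are
    assumptions chosen for illustration.
    """
    score = likelihood * consequence
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# Made-up findings from separate studies, scored on the same scales.
findings = [
    {"source": "RBI", "asset": "V-101", "likelihood": 2, "consequence": 3},
    {"source": "PHA", "asset": "V-101", "likelihood": 1, "consequence": 3},
    {"source": "RCM", "asset": "P-310", "likelihood": 3, "consequence": 1},
    {"source": "RCA", "asset": "P-310", "likelihood": 1, "consequence": 2},
]


def build_register(items):
    """Group findings by asset so overlapping work becomes visible."""
    register = defaultdict(list)
    for f in items:
        category = risk_category(f["likelihood"], f["consequence"])
        register[f["asset"]].append((f["source"], category))
    return dict(register)


for asset, entries in build_register(findings).items():
    print(asset, entries)
```

Because every study writes to the same register with the same criteria, two teams looking at V-101 can see each other's conclusions and reconcile priorities instead of working from separate rankings.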

The proper employment of each individual method should result in an effective outcome as it pertains to the specific drivers and objectives of each. With efficiency and alignment as a common goal, a comprehensive approach can be designed and tailored to industry and business specific environments and drivers.

Conclusion

Refineries, chemical plants, and other businesses that desire to achieve highly sustainable performance through the implementation of an efficient, aligned and harmonized reliability process have the opportunity to capitalize on the commonality between typically separate efforts. These can be merged into a comprehensive Business Reliability Process (BRP) that is focused on reliably achieving the full set of business goals and objectives in a way that is sustainable through future life-cycle changes.

