Root Cause Analysis

Root Cause Analysis (RCA) is a process that uses several problem-solving methods to identify the origin of a problem. The analysis answers when, how, and why a problem occurred in the first place, and it leads to an effective solution that keeps the problem from happening again. Sometimes the RCA process also highlights other areas for improvement in the organization and provides an opportunity to eliminate future problems. There is no single way to perform root cause analysis successfully. Instead, the process encourages creative and critical thinking to discover all possible causes that led to the original problem, and many problem-solving tools and strategies can be used along the way. Below is an introduction to the RCA process.

How to Do a Root Cause Analysis

Step 1. Define the Problem

The first step is to clearly define the problem. This includes identifying what is happening as well as identifying the “side effects” of the problem. Side effects can be caused by one or more underlying problems and thus should be treated with equal importance until more information is gathered. Once more information is collected, side effects can be prioritized in order of urgency.

Step 2. Collect Data

Collecting and interpreting large amounts of data can be a daunting task and is typically the longest step in the RCA process. Therefore, it’s beneficial to break up data collection into stages. Information should be immediately gathered on where the problem is occurring, how long the problem has been occurring, and the impact the problem has on external factors. The latter determines the significance of the problem. Examples of external factors include personnel safety, environmental safety, equipment, and revenue. The next stage of data collection provides information for analyzing causal relationships. This involves full data acquisition of the history of the equipment or system under investigation, the team members involved, and the organizational strategies employed.
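The first stage of data collection described above can be captured as a small record. The following is a minimal sketch in Python, not a prescribed RCA format: the class name InitialProblemRecord, its fields, and the sample values are all hypothetical, and only the four external factors come from the text.

```python
from dataclasses import dataclass, field
from datetime import timedelta

# External factors named in the text; impact on these determines significance.
EXTERNAL_FACTORS = ("personnel safety", "environmental safety", "equipment", "revenue")


@dataclass
class InitialProblemRecord:
    """First-stage data: where, how long, and what is affected (hypothetical layout)."""
    location: str
    duration: timedelta
    impacts: dict = field(default_factory=dict)  # external factor -> short note

    def record_impact(self, factor: str, note: str) -> None:
        """Store a note for one of the known external factors."""
        if factor not in EXTERNAL_FACTORS:
            raise ValueError(f"unknown external factor: {factor}")
        self.impacts[factor] = note

    def unassessed_factors(self) -> list:
        """External factors that still have no recorded impact."""
        return [f for f in EXTERNAL_FACTORS if f not in self.impacts]


# Hypothetical sample values, for illustration only.
record = InitialProblemRecord(location="Unit A piping", duration=timedelta(days=3))
record.record_impact("equipment", "leak at an insulated pipe section")
record.record_impact("revenue", "unit shut down")
print(record.unassessed_factors())  # ['personnel safety', 'environmental safety']
```

A checklist like this makes it visible which external factors have not yet been assessed before the team moves on to the second, more detailed stage of data collection.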

Step 3. Identify Causal Factors

Once the investigators are satisfied with the quantitative and qualitative data gathered, all possible causal factors are considered. The three major types of causes are equipment trouble or failure, human error, and flaws in organizational strategy. However, the investigation should not stop at the most obvious causal factor. Problems usually result from one or more of these factors, and each may contribute to the root problem.

The simplest way to organize data and make connections to causal relationships is to create a visual map. Maps clearly identify the root cause(s) and sequence of events that lead to the problem. Additionally, maps may target opportunities for downstream areas of improvement. For example, if the problem defined in Step 1 dealt with the hydrotreating process unit, analysts may determine ways to eliminate problems downstream in the isomerization or alkylation units.

Step 4. Develop and Implement Strategic Solutions

During the final stage of the RCA process, root causes are identified and analysts begin to develop strategic solutions. An effective RCA solution is one that involves all stakeholders and has a positive impact on the reliability of equipment, operational costs, the environment, and even corporate reputation. Furthermore, an effective solution is one that realigns causal factors to the new strategy. Consider an example of a piece of equipment that needs to be decommissioned due to irreparable corrosion damage. The root cause was determined to be that inspection intervals were not appropriate for the age of the component. The solution may be to replace the equipment and implement a strategy that reconsiders inspection intervals at various stages of the equipment’s lifecycle. In addition, engineers, operators, inspectors, maintenance, and management personnel need to be briefed and trained on the new routine inspection strategy.

Solutions are unique to every organization. One widely known approach that has been integrated into many organizational strategies is Kaizen. Kaizen is more of a philosophy than a strategy, but it nevertheless facilitates company-wide participation in continuous improvement. This means that an accumulation of small actions improves workplace activities. If everyone in the workplace participates, the company will experience growth in production and will be able to prevent failures from occurring in the future.

Root Cause Analysis Tools

Below are two common RCA tools used to identify causal relationships (Step 3 above). However, a number of other approaches for determining root causes are not covered here. Other failure analysis tools include the failure mode and effects analysis (FMEA), impact analysis, and risk analysis.

The 5 Whys

The 5 Whys method is a technique used to quickly determine the origin of a problem. The approach begins with a problem and asks “why” after every answer. As the name implies, “why” is asked five times in order to dig beneath the surface and reveal the underlying root causes of the problem. The following is an example of a 5 Whys analysis:

Problem: Refinery “X” had to shut down plant operations.

Why? Toxic fluid leaked into the plant from a pipe.

Why? Corrosion under insulation created a hole in the pipe.

Why? It had gone untreated for a significant amount of time.

Why? Operators didn’t know about the damage.

Why? Operator/inspection personnel didn’t perform routine inspections.

From this example, one can see that there are several opportunities for improvement. If “why” had been asked only twice, the solution would have focused on corrosion under insulation. However, further questioning reveals that the true cause was a lack of routine inspections. The 5 Whys approach is a powerful tool, and it works best when enough questions are asked to uncover every root cause rather than only the first plausible one.
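The chain above can be sketched as an ordered list, which makes it easy to see what the analysis would conclude if questioning stopped early. This is only an illustrative sketch: the function names render_five_whys and cause_at_depth are hypothetical, and the answers are copied from the refinery example.

```python
def render_five_whys(problem, answers):
    """Format a problem and its chain of answers as a 5 Whys listing."""
    lines = [f"Problem: {problem}"]
    lines += [f"Why? {answer}" for answer in answers]
    return "\n".join(lines)


def cause_at_depth(answers, depth):
    """Return the cause reached after asking "why" `depth` times (1-based)."""
    if not 1 <= depth <= len(answers):
        raise ValueError(f"depth must be between 1 and {len(answers)}")
    return answers[depth - 1]


problem = "Refinery X had to shut down plant operations."
answers = [
    "Toxic fluid leaked into the plant from a pipe.",
    "Corrosion under insulation created a hole in the pipe.",
    "It had gone untreated for a significant amount of time.",
    "Operators didn't know about the damage.",
    "Operator/inspection personnel didn't perform routine inspections.",
]

print(render_five_whys(problem, answers))
print("After two whys: ", cause_at_depth(answers, 2))
print("After five whys:", cause_at_depth(answers, 5))
```

Stopping at depth 2 returns the corrosion under insulation answer, while depth 5 reaches the missing routine inspections, which mirrors the point made in the text.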

Cause and Effect Diagrams

A Cause and Effect Diagram, also known as a Fishbone Diagram, is a tool commonly used to visualize the possible causes of a problem. The oil and gas industry sometimes refers to these diagrams as C&E Diagrams. The diagram is built on a single horizontal line (the “backbone”) that leads to the problem statement. The angled lines that branch off the backbone organize the possible causes by category. Causes can be generically broken down into six categories: environment, methods, machines, material, people, and measurement. Specific possible causes are then listed under each category.

Below is a simplified example of how a fishbone diagram would be constructed. In reality, these diagrams can be much more complex and include more than the six categories previously mentioned. In the example, a refinery is analyzing the root cause of hydrogen induced cracking in a low-temperature pressure vessel.

Hydrogen Induced Cracking Cause and Effect Diagram

Based on the 5 Whys analysis and the fishbone diagram, one can see that one major root cause was the lack of planned routine inspections. The fishbone diagram also illustrates that the nondestructive testing (NDT) method used was ineffective. Further investigation determined that advanced ultrasonic testing methods (e.g., phased array ultrasonic testing or time of flight diffraction) would be more reliable methods for detecting and characterizing corrosion. Thus, the solution to the corrosion problem would be to develop routine inspection intervals for the pressure vessel and to train personnel to set up and operate advanced NDT methods.
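For readers who want to keep a fishbone diagram in a form that can be reviewed or extended, the structure can be sketched as a mapping from category to a list of possible causes. The helper names below are hypothetical. The two causes entered are the ones named in the discussion above, but the branches they are placed on are an assumption, because the diagram itself is not reproduced here.

```python
# The six generic categories named in the text.
CATEGORIES = ("environment", "methods", "machines", "material", "people", "measurement")


def new_fishbone(problem):
    """Create an empty diagram: one problem and one empty branch per category."""
    return {"problem": problem, "causes": {category: [] for category in CATEGORIES}}


def add_cause(diagram, category, cause):
    """Attach a possible cause to a branch; categories beyond the six are allowed."""
    diagram["causes"].setdefault(category, []).append(cause)


def render(diagram):
    """Return a plain-text outline of the diagram."""
    lines = [f"Problem: {diagram['problem']}"]
    for category, causes in diagram["causes"].items():
        lines.append(f"  {category}")
        lines += [f"    - {cause}" for cause in causes]
    return "\n".join(lines)


diagram = new_fishbone("Hydrogen induced cracking in a low-temperature pressure vessel")
# Branch placement below is assumed for illustration.
add_cause(diagram, "methods", "routine inspections were not planned")
add_cause(diagram, "measurement", "NDT method used was ineffective")
print(render(diagram))
```

Branches left empty in the outline are a prompt to ask whether that category was genuinely ruled out or simply not yet investigated.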
