A root cause is the factor responsible for any anomaly or inconsistency that leads to a deviation from specified standards. It is the core of the underlying issue. A root cause can also set in motion several cause-and-effect reactions that may or may not be directly related to it but could ultimately impact the end result in production.

Root cause analysis (RCA) collectively denotes the techniques and methodologies used to identify the cause of any deviation or irregularity in a given system. A wide range of tools, techniques, and approaches enables these analyses. Some of these methodologies are used to identify and rectify the core of the issue, while others support other RCA activities.

RCA is generally viewed as a reactive measure. However, comprehensive application of RCA techniques can also aid in creating a proactive system that can predict problems before they occur.

The Significance And Need For RCA

Consider a broken limb being treated only with painkillers. There is likely to be relief for 4 to 6 hours, but the underlying condition will remain. Similarly, in the IT and manufacturing world, deviations from optimal production are inevitable, and what makes the difference is identifying and rectifying the core of the issue. RCA tools help get to the heart of the problem that requires a solution. Because RCA aims at permanent resolutions, it can help avoid the recurrence of incidents.

The Beginning Of Root Cause Analysis

Before evolving into the form we know today, RCA first made its appearance in the field of engineering. Much like lean manufacturing, the beginnings of RCA can be attributed to the management philosophy instituted by the Toyota Production System (TPS). One of the key contributions of Sakichi Toyoda along these lines was the formulation of the “5 Whys”. The underlying premise is to ask “why” five times in succession to arrive at the core of the problem.

For instance, let’s assume the internet is down. These would be the 5 Whys you would ask:

  1. Why is the Internet down? Because the DSL light on the router is not stable and is flashing intermittently.
  2. Why isn’t the DSL light stable? Because there could be an issue with either the line coming in from the filter, or with the line coming from the junction box outside. You’ll then want to check the jacks and cables involved.
  3. Why is there an issue with the filter? Because it could have either gone bad or have a loose connection. Try replacing the filter or connecting directly to the jack.
  4. Why is it still not working, despite replacing the filter or connecting directly to the jack? Because this could be an issue with the line coming in from the junction box. Get someone to check the junction box.
  5. Why is it not working despite everything being fine in the local junction box? Because there is an outage.

So, typically, the fifth “why” leads to the core of the issue.
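
The chain above can be sketched in code as an ordered list of question-and-answer pairs, with the last answer taken as the root cause. This is only a minimal illustration: the `Why` class and the `root_cause` function are hypothetical names, and the answers are condensed from the example.

```python
from dataclasses import dataclass


@dataclass
class Why:
    """One step in the chain: the question asked and the answer found."""
    question: str
    answer: str


def root_cause(chain: list[Why]) -> str:
    """Return the answer to the last "why", treated here as the root cause."""
    if not chain:
        raise ValueError("the chain needs at least one why")
    return chain[-1].answer


# The internet-outage example, condensed.
chain = [
    Why("Why is the internet down?",
        "The DSL light on the router is unstable."),
    Why("Why isn't the DSL light stable?",
        "The line from the filter or the junction box may be faulty."),
    Why("Why is there an issue with the filter?",
        "It may have gone bad or have a loose connection."),
    Why("Why is it still not working?",
        "The line from the junction box may be faulty."),
    Why("Why is it not working if the junction box is fine?",
        "There is an outage."),
]

for depth, step in enumerate(chain, start=1):
    print(f"{depth}. {step.question} {step.answer}")
print("Root cause:", root_cause(chain))
```

Keeping the chain as a plain list reflects the linear nature of the technique: each answer leads to exactly one follow-up question.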

This example is one of the simpler forms of RCA; other models vary in their complexity.

In 1986, Motorola developed a new strategy called Six Sigma. It was built primarily around reducing process variation and defects, and it employs specific statistical methods to support RCA. Six Sigma also refers to a quality level that roughly translates to about 3.4 defects per million opportunities.
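
The 3.4 figure is usually expressed as defects per million opportunities (DPMO): the number of defects divided by the total number of opportunities for a defect, scaled to one million. The sketch below shows the arithmetic; the function name and the sample figures are made up for illustration.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    return defects * 1_000_000 / total_opportunities


# Hypothetical figures: 17 defects across 500,000 units,
# each unit having 10 opportunities for a defect.
rate = dpmo(defects=17, units=500_000, opportunities_per_unit=10)
print(f"{rate:.1f} defects per million opportunities")
```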

How To Do A Root Cause Analysis?

There are plenty of tools and techniques that can be used to carry out RCA, but they all follow a defined set of steps.

Problem Definition

Problem definition is the first stage of an RCA. It is usually done only after a specific issue or discrepancy has been encountered, although it can also be carried out proactively by looking for possible issues before they arise. This stage requires a study from every possible perspective to understand the possible causes, triggers, and other factors that may or may not be directly associated with the issue. The problem definition states the problem, its immediate and long-term business implications, and its level of criticality.

Data Collection

Data collection is a fairly straightforward procedure. It involves collecting all possible data related to the issue at hand. This step may require close correspondence with the customer. The frequency of the problem, related events that could have led to the issue, and other factors must be captured as well. Data collection is critical in creating and understanding connections during the analysis.

Problem Identification

Problem identification is the most critical stage in the process of root cause analysis. It is responsible for identifying the underlying cause of the problem. It involves close scrutiny of all the factors that are directly or indirectly related to the problem. Where an issue can have more than one cause, all the possibilities are explored and logged. This stage requires the participation of all the stakeholders involved, for better clarity and effective recognition of incongruities. This is the stage of the RCA process where tools and techniques like the “5 Whys”, the Fishbone diagram, and the Scatter diagram are used to arrive at the core of the issue.


Once the list of all the possible problems is made, it is important to weigh them based on priority. The aspects that are highly critical need to be attended to first. The rest can be resolved in order of decreasing significance. Prioritization is important because all the problems cannot be tackled at once, and there can be interdependencies that need to be considered while working on one part of the problem. For instance, changes made to one program module are likely to impact the other parts of the program that are linked to it, so performing top-down and bottom-up integration testing can catch the bugs those changes introduce. Techniques like the PICK matrix and MoSCoW prioritization are used at this stage.
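
As an illustration of this step, the sketch below sorts a backlog using a PICK matrix, which places each item in one of four quadrants by payoff and difficulty: Possible (low payoff, easy), Implement (high payoff, easy), Challenge (high payoff, hard), and Kill (low payoff, hard). The 1-to-10 scales, the threshold of 5, the order in which quadrants are worked, and the backlog items are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    name: str
    payoff: int      # 1 (low) to 10 (high); hypothetical scale
    difficulty: int  # 1 (easy) to 10 (hard); hypothetical scale


# Order in which quadrants are worked; this ordering is an assumption.
QUADRANT_ORDER = ["Implement", "Challenge", "Possible", "Kill"]


def pick_quadrant(problem: Problem, threshold: int = 5) -> str:
    """Place a problem in one of the four PICK quadrants."""
    high_payoff = problem.payoff > threshold
    easy = problem.difficulty <= threshold
    if high_payoff:
        return "Implement" if easy else "Challenge"
    return "Possible" if easy else "Kill"


def prioritize(problems: list[Problem], threshold: int = 5) -> list[Problem]:
    """Sort by quadrant first, then by descending payoff within a quadrant."""
    return sorted(
        problems,
        key=lambda p: (QUADRANT_ORDER.index(pick_quadrant(p, threshold)), -p.payoff),
    )


# Hypothetical backlog.
backlog = [
    Problem("Rewrite legacy module", payoff=8, difficulty=9),
    Problem("Fix failing integration test", payoff=9, difficulty=3),
    Problem("Rename internal variables", payoff=2, difficulty=2),
    Problem("Support obsolete file format", payoff=1, difficulty=8),
]

for p in prioritize(backlog):
    print(f"{pick_quadrant(p):<10} {p.name}")
```

The sketch does not model interdependencies between problems; in practice those may force a different order than the quadrants alone suggest.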

Solutioning and Implementation

The primary focus of this stage is developing effective solutions to eliminate the problem at its core. All the likely solutions are tested for efficiency, stability, and robustness. This phase also redefines the scope for stakeholders and their responsibilities, to enable quicker identification of the cause in the unlikely event of a future incident. The chosen solution is then implemented and tested before deployment.

Supervising and Sustainability

Effective resolution always goes one step beyond just providing solutions. It also ensures smooth operation by monitoring for inconsistencies after the resolution has been integrated with the rest of the business process. One key way to improve sustainability is to regularly monitor the impact of the resolution on business objectives.

Root Cause Analysis Tools

Tools used to perform root cause analysis largely depend on the type and complexity of the issue. The tool used is as critical as the problem definition itself. There is no one-size-fits-all solution when it comes to choosing the tool or technique to be employed. For instance, the “5 Whys” technique is most effective when used to tackle simple to moderate problems that have a limited number of causes. This is partly because the “5 Whys” is a linear approach in one direction, which is not likely to suit problems with multiple causes.

We have put together a brief summary of some of the most commonly used RCA tools.

Pareto Chart

A Pareto chart is most commonly used when analyzing data about the frequency of problems. It can also be used in prioritizing and identifying the most significant problem when a number of issues exist.

A Pareto chart combines a bar chart with a line graph to portray the relative significance of various problems. The bars denote the frequency of each problem in descending order, and the line displays the cumulative total moving from left to right.
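
The data behind such a chart can be computed in a few lines. The sketch below counts how often each problem occurs, sorts the counts in descending order, and tracks the cumulative percentage; the function name and the incident categories are hypothetical.

```python
from collections import Counter


def pareto_table(observations: list[str]) -> list[tuple[str, int, float]]:
    """Return (category, count, cumulative percent) rows, most frequent first."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in counts.most_common():
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows


# Hypothetical incident log: one entry per occurrence.
incidents = ["timeout"] * 5 + ["disk full"] * 3 + ["bad config"] * 2

for category, count, cumulative in pareto_table(incidents):
    # The row of '#' stands in for the bar; the percentage is the line.
    print(f"{category:<12}{'#' * count:<6}{cumulative:5.0f}%")
```

Reading the cumulative column from the top shows how few categories account for most of the occurrences, which is what makes the chart useful for prioritizing.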

Fishbone Diagram

The Fishbone or Ishikawa method is mostly used to analyze a problem statement or brainstorm the cause of a problem. It can also be used to scrutinize process and quality improvement.

The fishbone technique offers a visual method to diagnose the problem. It keeps the focus on the underlying problem rather than on its symptoms. Fishbones offer a great way to brainstorm within a well-defined structure.
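
One simple way to capture a fishbone during a brainstorming session is as a mapping from cause categories (the "bones") to the candidate causes listed under each. The sketch below renders such a mapping as a text outline; the function name, the categories, and the causes are all invented for illustration, and teams typically choose categories that suit their own domain.

```python
def render_fishbone(problem: str, bones: dict[str, list[str]]) -> str:
    """Render the problem and its categorized candidate causes as an outline."""
    lines = [f"Problem: {problem}"]
    for category, causes in bones.items():
        lines.append(f"  {category}")
        for cause in causes:
            lines.append(f"    - {cause}")
    return "\n".join(lines)


# Hypothetical categories and causes.
bones = {
    "People": ["New operator not yet trained"],
    "Process": ["No checklist for changeovers"],
    "Equipment": ["Sensor overdue for calibration"],
    "Materials": ["Supplier changed last month"],
}

diagram = render_fishbone("Defect rate rose on line 3", bones)
print(diagram)
```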

Scatter Diagram

The Scatter diagram method is ideally used as a follow-up to brainstorming sessions on causes and effects using the fishbone method. Its purpose is to determine objectively whether a particular cause and effect, or any two variables, are related.

The scatter plot or scatter diagram technique uses pairs of data points to reveal relationships between different variables. It is a quantitative methodology that helps determine the correlation between two variables. Normally, a scatter diagram is created by plotting the independent variable (the cause) along the x-axis and the dependent variable (the effect) along the y-axis. If the points form a clear line or curve, the variables are correlated.
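
The visual impression from a scatter diagram can be backed by a number. One common choice is the Pearson correlation coefficient, which ranges from -1 to 1 and measures how closely the points follow a straight line. The sketch below computes it by hand; the function name and the sample data (server load as the cause, response time as the effect) are hypothetical.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between paired samples xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical measurements: cause on the x-axis, effect on the y-axis.
load_percent = [10, 20, 30, 40, 50]
response_ms = [110, 135, 148, 172, 190]

r = pearson_r(load_percent, response_ms)
print(f"r = {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship. Because the coefficient only captures straight-line patterns, a clearly curved relationship can still produce a modest value, so the plot itself remains worth inspecting. Correlation also does not by itself prove that one variable causes the other.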