What is fault tree analysis (FTA)?
What is FTA?

Fault tree analysis (FTA) offers one approach to root cause analysis, identifying and analyzing the root of asset issues before equipment breaks down. FTA helps in manufacturing facilities, where understanding the potential causes of system failures is crucial to preventing them.

Fault tree analysis is a deductive, top-down approach to determining the cause of a specific undesired event within a complex system. It involves breaking down the root cause of a failure into its contributing factors and representing it through a graphical model called a fault tree, which helps managers and engineers identify potential failure modes—and the probability of each failure mode—for safety and reliability analyses.

First developed in the early 1960s by Bell Laboratories to help the US Air Force understand potential flaws in the Minuteman missile system, FTA has been widely used across various industries, including the aerospace, nuclear power, chemical and automotive sectors, among others.

Maintenance managers might use fault tree analysis to:

  • Design and/or install a new system.
  • Make changes to existing systems.
  • Investigate system safety or system reliability.
  • Assess regulatory compliance.
  • Optimize maintenance budgets.

As manufacturing environments continue to evolve and become more complex, the need for effective risk management tools like FTA becomes increasingly important. Incorporating fault tree analyses into your organization's safety analyses and reliability engineering practices can help your organization gain deeper insights into potential causes of system failure. FTA can also help improve overall performance and reduce the likelihood of costly and potentially catastrophic incidents.

Performing a fault tree analysis

Performing a fault tree analysis is a complex process that involves seven key steps.

Step 1: Define the undesired event

Before running your analysis, you should clearly define the undesired event you want to analyze. This event should be specific and measurable, like a component failure or a system malfunction. It’s also important to define the event in clear, consistent terms, since it serves as the starting point for your fault tree diagram.

Step 2: Identify the contributing events and factors

Once you define the undesired event, you should start to identify the factors and events that might contribute to its occurrence. Contributing factors tend to fall into two broad categories: basic events and intermediate events.

Basic events are events that cannot be broken down into simpler events. They are the most fundamental events in a fault tree and represent the lowest level of events you can analyze. A basic event in a fault tree for a car accident, for example, might be "tire blows out".

Intermediate events are located between the lower-level basic events and the top event (the primary undesired event being analyzed). Intermediate events are caused by other events in the fault tree and, in turn, cause other events. They represent higher-level events that can be analyzed further. Using the same car accident as an example, an intermediate event in the fault tree might be "the driver loses control of the vehicle", which a tire blowout could cause.

Be sure to consider both internal and external events, like component failures, human error and environmental conditions. You might need to consult with subject matter experts and review historical data, incident reports and maintenance records at this stage of the analysis.

Step 3: Construct the fault tree

Using standard gate symbols and event symbols, construct a graphical representation of the relationships between the undesired (or output) event and its contributing factors (also called input events). The fault tree should be organized hierarchically, with the undesired event at the top and the contributing factors branching out below it.

Laying out basic events is straightforward, since basic events sit at the bottom of the tree and have no input events of their own. Including intermediate events is more complex, as each one requires a Boolean logic gate that indicates how its input events combine to produce it.

There are two main types of logic gates used in fault trees: AND gates and OR gates.

  • AND gates: Use AND gates when all contributing events must occur simultaneously for the undesired event to occur. For example, if a system failure requires both a component failure and an operator error, an AND gate is used to connect the events in the fault tree.
  • OR gates: Use an OR gate when any one of the input events is sufficient to cause the output event. In other words, the output event happens if at least one of the input events connected to the OR gate happens. If, for instance, a system failure might result from either a component failure or an operator error, an OR gate would be used to connect the events.
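The two gate types above translate directly into probability rules when the input events are statistically independent. The following is a minimal sketch under that independence assumption; the function names and the probability values are illustrative and do not come from any particular FTA tool.

```python
from math import prod

def and_gate(probabilities):
    """Probability that ALL independent input events occur."""
    return prod(probabilities)

def or_gate(probabilities):
    """Probability that AT LEAST ONE independent input event occurs."""
    # Complement of "no input event occurs".
    return 1.0 - prod(1.0 - p for p in probabilities)

# Illustrative (made-up) probabilities for the two inputs in the text.
component_failure = 0.01
operator_error = 0.05

p_and = and_gate([component_failure, operator_error])  # both must happen
p_or = or_gate([component_failure, operator_error])    # either is enough

print(f"AND gate: {p_and:.4f}")
print(f"OR gate:  {p_or:.4f}")
```

With the same inputs, the AND gate always yields a probability no larger than the OR gate, which is why redundancy (modeled with AND) lowers the likelihood of the top event.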

Though less commonly used, NOT gates, XOR gates, K/N gates and INHIBIT gates can also help identify specific relationships between input and output events.

  • NOT gates: NOT gates represent the inverse of an input event. If the input event does not occur, the output event will occur. These gates are less common in fault tree analysis, since they model the absence of an event or the occurrence of a complementary event.

  • XOR gates (Exclusive OR gates): Use an XOR gate when exactly one of the input events must occur for the output event to happen. If none or more than one of the input events occur, the output event will not happen.

  • K/N gates: K/N gates, also known as voting gates or threshold gates, are used when a specific number of the input events (K) out of all the possible input events (N) must occur for the output event to happen. K/N gates can help you illustrate more complex relationships in a fault tree analysis.

  • INHIBIT gates: Like an AND gate, an INHIBIT gate indicates that an output event will occur only if both the input event and a conditional event (a condition or restriction that can apply to any gate) occur.
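The less common gates can be sketched as simple Boolean functions. This is a rough illustration, not a standard library: the function names are hypothetical, and the K/N gate is assumed here to fire when at least K of the N inputs occur, which is the usual voting-gate convention.

```python
def not_gate(input_event):
    """Output occurs when the input event does NOT occur."""
    return not input_event

def xor_gate(inputs):
    """Output occurs when exactly one input event occurs."""
    return sum(bool(x) for x in inputs) == 1

def k_of_n_gate(k, inputs):
    """Output occurs when at least k of the input events occur (assumed convention)."""
    return sum(bool(x) for x in inputs) >= k

def inhibit_gate(input_event, condition):
    """Output occurs when the input event occurs AND the condition holds."""
    return bool(input_event) and bool(condition)

# Example: a 2-out-of-3 voting arrangement with two sensors failed.
sensors_failed = [True, False, True]
voting_failed = k_of_n_gate(2, sensors_failed)
print(voting_failed)
```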

Intermediate events can also include undeveloped events, which are events that aren’t fully understood or haven’t been fully analyzed.

Using the various available gates will help you create a comprehensive fault tree that captures the complex interactions between the various events and factors that precipitated the undesired event.

Building a fault tree is an iterative process, so you continue to break down contributing events into their basic sub-events until the events cannot be parsed out any further. As you get new information and/or system conditions change, you might need to make several adjustments to refine the fault tree.
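One way to picture the hierarchy is as a small recursive data structure. The sketch below is an assumption-laden illustration: the `Event` class, the coolant pump example and all event names are hypothetical, and only AND and OR gates are supported.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Event:
    name: str
    gate: Optional[str] = None          # "AND" or "OR"; None for a basic event
    inputs: List["Event"] = field(default_factory=list)

    def is_basic(self) -> bool:
        return not self.inputs

def basic_events(event: Event) -> Set[str]:
    """Collect the names of all basic events beneath `event`."""
    if event.is_basic():
        return {event.name}
    names: Set[str] = set()
    for child in event.inputs:
        names |= basic_events(child)
    return names

def occurs(event: Event, failed: Set[str]) -> bool:
    """Return True if `event` occurs given the set of failed basic events."""
    if event.is_basic():
        return event.name in failed
    results = [occurs(child, failed) for child in event.inputs]
    return all(results) if event.gate == "AND" else any(results)

# Hypothetical tree: coolant flow is lost if both pumps fail OR power is lost.
top = Event("coolant flow lost", "OR", [
    Event("both pumps fail", "AND", [
        Event("primary pump fails"),
        Event("backup pump fails"),
    ]),
    Event("power supply lost"),
])

print(sorted(basic_events(top)))
print(occurs(top, {"primary pump fails"}))
```

Refining the tree then amounts to replacing a basic event with a gated event that has its own inputs.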

Step 4: Gather failure data

In order to quantify the risks associated with the undesired event, you need to gather failure data (from historical records, industry databases, expert opinions, etc.) for the basic events in the fault tree. The failure data should be expressed as failure probabilities or failure rates, depending on the type of analysis you’re conducting.
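Failure rates and failure probabilities are related but not interchangeable. A common simplification, assumed here, is a constant failure rate (the exponential model), under which the probability of failure within a time window follows from the rate. The function names and the sample figures below are illustrative only.

```python
import math

def rate_from_history(failures, operating_hours):
    """Estimate a constant failure rate (failures per hour) from records."""
    return failures / operating_hours

def failure_probability(failure_rate_per_hour, mission_hours):
    """P(failure within mission_hours), assuming a constant failure rate."""
    return 1.0 - math.exp(-failure_rate_per_hour * mission_hours)

# Made-up maintenance record: 4 failures over 80,000 operating hours.
rate = rate_from_history(4, 80_000)
p = failure_probability(rate, 1_000)  # chance of failure in a 1,000-hour window

print(f"rate = {rate:.2e} per hour, P(failure in 1,000 h) = {p:.4f}")
```

If components wear out over time, a constant rate is a poor fit and a different lifetime model would be needed.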

Step 5: Perform the analysis

Once you construct the fault tree and gather the failure data, you perform the analysis, wherein you calculate the probability of the undesired event occurring and identify the most critical contributing factors. Use either a qualitative or a quantitative data analysis method.

A qualitative analysis focuses on understanding the structure of the fault tree, the relationships between events, and the identification of critical paths and minimal cut sets (the smallest combinations of basic events that, occurring together, cause the undesired event). Qualitative analysis can help prioritize remedial actions and identify areas for further investigation.
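Minimal cut sets can be derived mechanically for a small tree. The following is a simplified top-down sketch that handles only AND and OR gates; the nested-tuple tree format and the event names are assumptions made for illustration, and real tools use more efficient algorithms.

```python
from itertools import product

def minimize(sets):
    """Drop any cut set that is a strict superset of another cut set."""
    unique = set(sets)
    return [s for s in unique if not any(other < s for other in unique)]

def cut_sets(node):
    """Return the minimal cut sets of a tree of ("AND"|"OR", [children]) tuples.

    A plain string is treated as a basic event.
    """
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any child's cut set is enough on its own.
        combined = [cs for sets in child_sets for cs in sets]
    elif gate == "AND":
        # Need one cut set from every child at the same time.
        combined = [frozenset().union(*combo) for combo in product(*child_sets)]
    else:
        raise ValueError(f"unsupported gate: {gate}")
    return minimize(combined)

# Hypothetical tree; the third branch is redundant and should be pruned.
tree = ("OR", [
    ("AND", ["primary pump fails", "backup pump fails"]),
    "power supply lost",
    ("AND", ["power supply lost", "backup pump fails"]),
])

minimal = cut_sets(tree)
for cs in sorted(minimal, key=lambda s: (len(s), sorted(s))):
    print(sorted(cs))
```

Cut sets with a single event point to single points of failure, which are usually the first candidates for remedial action.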

A quantitative methodology, on the other hand, involves calculating the probability of the undesired event occurring based on the failure probabilities of the basic events. Quantitative analysis can help inform risk management decisions and evaluate the effectiveness of proposed improvements.
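As a rough sketch of the quantitative step, the top-event probability can be estimated from the minimal cut sets, assuming the basic events are independent. The estimate below is the minimal cut set upper bound, which is exact when the cut sets share no basic events (as in this example). All names and probabilities are made up for illustration.

```python
from math import prod

# Illustrative (made-up) basic event probabilities.
basic_probabilities = {
    "primary pump fails": 0.02,
    "backup pump fails": 0.1,
    "power supply lost": 0.001,
}

# Assumed minimal cut sets for a hypothetical coolant system.
minimal_cut_sets = [
    frozenset({"primary pump fails", "backup pump fails"}),
    frozenset({"power supply lost"}),
]

def cut_set_probability(cut_set, probabilities):
    """All events in the cut set must occur (independence assumed)."""
    return prod(probabilities[name] for name in cut_set)

def top_event_probability(cut_sets, probabilities):
    """Minimal cut set upper bound on the top-event probability."""
    return 1.0 - prod(
        1.0 - cut_set_probability(cs, probabilities) for cs in cut_sets
    )

p_top = top_event_probability(minimal_cut_sets, basic_probabilities)

# Rank cut sets by their individual probability to find the main contributors.
ranked = sorted(
    ((sorted(cs), cut_set_probability(cs, basic_probabilities))
     for cs in minimal_cut_sets),
    key=lambda item: item[1],
    reverse=True,
)

print(f"P(top event) = {p_top:.6f}")
for names, p in ranked:
    print(f"{p:.6f}  {names}")
```

Re-running the calculation with a proposed improvement, such as a lower backup pump failure probability, gives a simple way to compare options.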

Step 6: Interpret the results

After performing the analysis, it’s time to interpret your results and communicate any relevant information to the necessary stakeholders.

The results of a fault tree analysis depend on the quality of the input data and the assumptions made during the analysis. As such, you should view the results as a starting point for further investigation and validation, rather than a definitive conclusion.

Step 7: Implement improvements and monitor progress

Based on the findings of the fault tree analysis, implement preventive measures and improvements as necessary to eliminate or decrease the likelihood of the undesired event. Then monitor the performance of these improvements and continually update the fault tree to reflect any changes in system design, operating conditions or component performance, so that your tree remains accurate and useful to your organization.

Benefits of fault tree analysis
  • FTA provides a visual depiction of contributing factors and events that can lead to a system failure, making it easier to understand complex interactions between system components.

  • FTA allows you to calculate the probability of a failure event occurring, enabling better risk management and decision-making and helping teams be proactive about corrective actions.

  • Since you can analyze only one output event at a time, fault tree analysis helps teams stay organized as they assess system levels and work through effects analyses methodically.

  • Unlike a typical failure mode and effects analysis (FMEA), FTA can account for human error, which can help teams understand whether issues are related to deviations from standard operating procedure.

  • FTA identifies which failures are likeliest to occur, helping teams decide which issues require urgent attention.

Limitations of fault tree analysis
  • The accuracy and effectiveness of FTA rely heavily on the expertise of the analysts, their ability to identify relevant causes of failure, and their understanding of the complexities of the fault tree itself.
  • FTA is best suited for smaller system analyses. Large, complex systems typically require large, complex fault trees, making analysis time-consuming and challenging.
  • Failure data availability and quality determine the precision of the calculated probabilities in a fault tree.
  • Fault tree analysis allows you to examine only one top event at a time.