Failure mode and effects analysis (FMEA) is a structured failure mitigation framework that aims to identify the potential failures in the components of a design, manufacturing process, service or product before they occur.
FMEA analyzes the causes of failures, infers the effects of failure and prioritizes corrective actions based on a risk assessment ranking system. By focusing on the risks with the highest potential impact, organizations can efficiently improve reliability, reduce the costs of failure and strengthen operational resilience.
FMEA’s step-by-step procedure helps organizations proactively increase reliability and manage risk by focusing on the process failures with the highest projected impacts. Organizations can use FMEA to evaluate both existing processes and new ones before implementation, leading to stronger quality controls, enhanced brand perception and greater customer satisfaction and lifetime value.
Organizations are augmenting FMEA with AI in operations management to improve failure prediction and automate risk detection by using real-time operational data. FMEA is critical for risk governance, enabling organizations to act ahead of disruptions and make reliability a core component of service and product lifecycle management.
Various approaches to potential failure mode and effects analysis (FMEA) have been developed to cover a range of industries and processes. Different types of FMEA apply to specific stages of the product and operational lifecycle, giving organizations the necessary frameworks through which they can apply risk analysis for maximum strategic value.
The types of FMEA include:
FMEA guides organizations through a standardized, step-by-step methodology. Initially developed by the US military in the 1940s, FMEA is now used worldwide, and its current automotive-industry form is defined in the 2019 AIAG-VDA FMEA Handbook.
AIAG-VDA FMEA combines the best practices of the US-based Automotive Industry Action Group (AIAG) and the German Association of the Automotive Industry (VDA). FMEA standardization facilitates consistency in risk evaluation and quality governance.
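The seven AIAG-VDA FMEA steps are planning and preparation, structure analysis, function analysis, failure analysis, risk analysis, optimization and results documentation.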
The first step in FMEA is to lay out a plan for the entire sequence and establish the scope of the study. Planning ahead aligns organizations and stakeholders, facilitates more effective quality outcomes and situates risk analysis within broader business objectives.
The planning phase itself is divided into five stages:
Structure analysis outlines the process steps being analyzed for failure potential. The team generates a flowchart for the process that shows its role in the greater system as well as any subsystems that stem from it. Structure analysis yields system-level visibility that can reveal interdependencies and help prevent cascading failures.
Workflow diagrams are often used to facilitate structure analysis. Through process mapping, the system, product or process is broken down into individual components or subsystems.
Structure analysis reveals the interdependencies between process steps—the components or subsystems—and shows how a single point of failure can affect the overall system or product.
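As a rough illustration of how structure analysis can expose a single point of failure, the sketch below models a small system as a dependency graph and lists every component that would be affected if one component failed. The component names, the `FEEDS` mapping and the `affected_by` helper are all hypothetical and are not part of the AIAG-VDA methodology.

```python
from collections import deque

# Hypothetical structure: each key feeds the components listed as its values.
FEEDS = {
    "power_supply": ["controller", "sensor"],
    "sensor": ["controller"],
    "controller": ["actuator"],
    "actuator": [],
}


def affected_by(failed, feeds):
    """Return a sorted list of all components downstream of a failed one."""
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for downstream in feeds.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return sorted(seen)


if __name__ == "__main__":
    # Print the downstream impact of each component failing in isolation.
    for component in FEEDS:
        print(component, "->", affected_by(component, FEEDS))
```

In this toy structure, a failure in `power_supply` reaches every other component, which is the kind of cascading dependency a structure analysis is meant to surface.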
Function analysis establishes the purpose of each stage in the structure as outlined in the previous step. The team describes the function of each subsystem or component and traces the functional relationships between them.
Having a granular, functional understanding of the system or product will help the team members with brainstorming failure modes and conducting root cause analysis in later FMEA steps.
Function analysis helps bridge the gap between how a product or service works and whether it suits business requirements and customer needs.
The fourth step of the FMEA process tasks the team with identifying and describing potential failures for each function from the third step, typically in terms of the failure mode, its effects and its causes.
Failure analysis compiles a reusable knowledge base of failure modes, accelerating future risk assessments.
After establishing the possible failure states, the team assesses the level of risk presented by each failure according to three metrics: severity (S), occurrence (O) and detectability (D).
Severity, occurrence and detectability are typically ranked 1–10. Risk analysis turns potential failures into actionable priorities.
Traditional FMEA methods determined risk rank through the risk priority number (RPN). The formula for calculating RPN multiplies severity, occurrence and detectability:
RPN = S × O × D
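The RPN calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the typical 1 to 10 scale for each rating. The `FailureMode` class, the `rank_by_rpn` helper and the example failure modes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int       # S rating, assumed 1-10
    occurrence: int     # O rating, assumed 1-10
    detectability: int  # D rating, assumed 1-10

    def __post_init__(self):
        # Reject ratings outside the assumed 1-10 scale.
        for label in ("severity", "occurrence", "detectability"):
            value = getattr(self, label)
            if not 1 <= value <= 10:
                raise ValueError(f"{label} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # RPN = S x O x D
        return self.severity * self.occurrence * self.detectability


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical example data, for illustration only.
MODES = [
    FailureMode("seal leak", severity=7, occurrence=4, detectability=3),
    FailureMode("sensor drift", severity=5, occurrence=6, detectability=6),
    FailureMode("connector corrosion", severity=9, occurrence=2, detectability=2),
]

if __name__ == "__main__":
    for mode in rank_by_rpn(MODES):
        print(f"{mode.name}: RPN = {mode.rpn}")
```

Because the three ratings are multiplied with equal weight, the failure with the highest severity in this example receives the lowest RPN.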
AIAG-VDA FMEA introduced the concept of action priority (AP), a new risk rank method that characterizes risks by the urgency of corrective action required. This change strengthened the connection between risk severity and the need for corrective action, rather than asking teams to make that connection themselves.
AP assigns each failure one of three values: high (H), medium (M) or low (L).
Teams determine AP by consulting tables that contain various combinations of values for S, O and D. In a pivot toward a greater focus on safety, AP weights severity most heavily, followed by occurrence and then detection, so a failure with an S value of 9 or 10 is rated high priority across most combinations of O and D values.
After prioritizing risks, the team identifies and plans corrective actions to reduce or eliminate risks in order of rank. Optimization involves lowering AP ratings to more acceptable levels by making risks less severe, less frequent or more easily detectable. Corrective actions should lead to safety and reliability improvements.
Optimization is heavily dependent on cross-functional collaboration. Quality management hinges on process controls in design, manufacturing and other business functions. By systematically implementing corrective actions, organizations can reduce costs, improve reliability and shorten time-to-market.
The final stage of the FMEA process is where teams document their findings. Failure points, risks, corrective actions and outcomes all have a place in the FMEA worksheet. The results phase helps centralize and further deploy FMEA insights across the organization.
Thorough documentation helps ensure organization-wide consistency and accountability and informs the creation of subsequent quality control plans and follow-up procedures, such as reliability-centered maintenance (RCM) plans. The benefits of RCM include better planning, reduced preventative maintenance and lower mean time to repair (MTTR).
Documentation also contributes to ongoing regulatory compliance. Organizations must often provide proof of risk identification and mitigation to earn required industrial certifications.
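To make the documentation step more concrete, the sketch below stores FMEA worksheet rows as plain dictionaries, exports them as CSV and lists the corrective actions that are still open. The column names, the example rows and both helper functions are hypothetical. The official AIAG-VDA form has its own layout, which this sketch does not attempt to reproduce.

```python
import csv
import io

# Hypothetical worksheet columns, for illustration only.
FIELDS = [
    "process_step", "failure_mode", "effect", "cause",
    "severity", "occurrence", "detectability",
    "corrective_action", "status",
]

# Hypothetical example rows.
ROWS = [
    {
        "process_step": "final assembly", "failure_mode": "seal leak",
        "effect": "fluid loss", "cause": "worn gasket",
        "severity": 7, "occurrence": 4, "detectability": 3,
        "corrective_action": "replace gasket material", "status": "open",
    },
    {
        "process_step": "calibration", "failure_mode": "sensor drift",
        "effect": "inaccurate reading", "cause": "temperature variation",
        "severity": 5, "occurrence": 6, "detectability": 6,
        "corrective_action": "add periodic recalibration", "status": "closed",
    },
]


def worksheet_to_csv(rows, fields=FIELDS):
    """Serialize worksheet rows to a CSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def open_actions(rows):
    """Return the rows whose corrective action is not yet closed."""
    return [row for row in rows if row["status"] != "closed"]


if __name__ == "__main__":
    print(worksheet_to_csv(ROWS))
    for row in open_actions(ROWS):
        print("Open:", row["failure_mode"], "->", row["corrective_action"])
```

Keeping the worksheet in a structured, exportable form like this is one way to support the audit trail and follow-up planning described above.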
FMEA delivers numerous benefits at the enterprise level above and beyond quality improvement. These benefits include:
Implementing FMEA with digital tools like artificial intelligence (AI) equips organizations for predictive failure modeling and scalable continuous improvement.