IBM Support

Defect Prediction using SPSS Modeler

Technical Blog Post


Abstract:

One of the major challenges in the Software Development Lifecycle (SDLC) is the number of defects found both during the test cycle and after the software release. Zero-defect code is a myth, so the effort has always been to reduce the number of defects, to be better prepared to handle them, and to shorten turnaround on defect fixes and fix pack releases to customers. In this article, we look at how defect prediction using SPSS Modeler can help address some of these areas.

 

Introduction:

SPSS Modeler is a predictive analytics platform that is designed to bring predictive intelligence to decisions made by individuals, groups, systems and the enterprise. It provides a range of advanced algorithms and techniques, including text analytics, entity analytics, decision management and optimization, to help you select the actions that result in better outcomes.

SPSS Modeler is used to build predictive models and conduct other analytic tasks. It has a visual interface, which allows users to leverage statistical and data mining algorithms without programming.

 

Why prediction?

Each defect discovered after the product release is costly: it impacts customer experience and requires extensive development effort. If data is available on the areas where more defects are expected, or where regressions are more likely, teams can be better prepared in terms of planning, automation, resource management and so on.

 

Scope of prediction

The following are a few prediction scenarios, which can be adapted to the type of application, the industry, the impact of defects and so on:

  • Which functional area can we expect the most defects from in the next release?
  • Which functional area can we expect the most regression defects from in the next release?
  • Which functional area can we expect high-severity defects from in the next four months?
  • Which customer is likely to raise the highest number of PMRs (Problem Management Records) for the next release?
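
Before building a model for any of these scenarios, it can help to look at a simple historical baseline. The following is a minimal Python sketch, not part of the Modeler workflow, that ranks functional areas by past defect counts. The record layout (functional area, severity, regression flag) and all values are hypothetical and were chosen only for illustration.

```python
from collections import Counter

# Hypothetical defect records: (functional_area, severity, is_regression).
# The fields and values are illustrative, not taken from the case study.
defects = [
    ("Search", "High", False),
    ("Search", "Low", True),
    ("Reporting", "High", True),
    ("Search", "High", True),
    ("Import", "Medium", False),
    ("Reporting", "Low", False),
]


def rank_areas(records, predicate=lambda record: True):
    """Count matching defects per functional area, most affected first."""
    counts = Counter(record[0] for record in records if predicate(record))
    return counts.most_common()


# One ranking per scenario: all defects, regressions only, high severity only.
all_defects = rank_areas(defects)
regressions = rank_areas(defects, lambda record: record[2])
high_severity = rank_areas(defects, lambda record: record[1] == "High")

print(all_defects)
print(regressions)
print(high_severity)
```

A ranking like this only describes the past. It is a reference point against which a forecast can be compared, not a prediction in itself.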

SPSS Implementation

SPSS Modeler offers a variety of modeling methods taken from machine learning, artificial intelligence and statistics. The methods available on the modeling palette enable you to derive new information from your data and to develop predictive models. Each method has certain strengths and is best suited for particular types of problems.

 

Data preparation:

Data preparation is one of the most critical, and often the most time-consuming, aspects of any data prediction. For this case study, we used four years of defect data for a software product, for a set of identified components. The prediction concerned incoming defects in the next six months for this product/component.

A sample of the data that we input to Modeler:

We developed a model using this data.
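
Preparing input of this kind usually means turning raw defect records into one row per time period. The following is a minimal Python sketch of that aggregation, assuming a hypothetical export with one (date opened, component) pair per defect. The dates, component names and the `monthly_counts` helper are illustrative and do not come from the case study.

```python
from collections import Counter
from datetime import date

# Hypothetical raw export: one (date opened, component) pair per defect.
raw_defects = [
    (date(2015, 1, 14), "Search"),
    (date(2015, 1, 30), "Search"),
    (date(2015, 3, 2), "Search"),
    (date(2015, 3, 9), "Reporting"),
]


def monthly_counts(records, component, start, end):
    """Return [(year, month, count)] for every month from start to end inclusive.

    start and end are (year, month) tuples. Months without defects get an
    explicit 0 so that the resulting series has no gaps.
    """
    counts = Counter(
        (opened.year, opened.month)
        for opened, comp in records
        if comp == component
    )
    series = []
    year, month = start
    while (year, month) <= end:
        series.append((year, month, counts[(year, month)]))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return series


search_series = monthly_counts(raw_defects, "Search", (2015, 1), (2015, 4))
for row in search_series:
    print(row)
```

Filling empty months with zero matters for time series work: a missing row and a month with no defects are not the same thing, and a forecasting step generally expects evenly spaced periods.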

Next steps:

  1. Upload the Excel sheet to Modeler.

  2. Read the values of the sheet and assign the Input and Target roles to the values using the TYPE field.

  3. Forecast the data for the next four months using the TIME INTERVALS field.

  4. Generate the predictive report using the Time Plot Model. There are many others, such as <>, <>, <> and so on; the appropriate model must be selected based on the use case.

  • Graphical representation:

  • Tabular representation of the data:

 

Prediction result:

The column “$TS-No of Defects” represents the number of defects expected in the next four months. This gives teams an early start on resource management, skills and estimation, and helps with quick turnaround on fixes and patch releases.
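
To make the idea behind a forecast column concrete, the following is a minimal Python sketch that uses simple exponential smoothing as a stand-in. It is not the algorithm Modeler applies, and the monthly counts, the smoothing factor `alpha` and the function name are assumptions made for illustration. Only the column name "$TS-No of Defects" is taken from the result above.

```python
def exponential_smoothing_forecast(history, periods, alpha=0.5):
    """Forecast future values with simple exponential smoothing.

    The level is updated as level = alpha * observed + (1 - alpha) * level,
    and every future period is forecast at the final level.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the range (0, 1]")
    level = history[0]
    for observed in history[1:]:
        level = alpha * observed + (1 - alpha) * level
    return [level] * periods


# Hypothetical monthly defect counts for one component.
monthly_defects = [12, 16, 10, 14]
forecast = exponential_smoothing_forecast(monthly_defects, periods=4)

# Present the forecast in a column named like the Modeler output.
rows = [
    {"Month": f"M+{offset}", "$TS-No of Defects": round(value, 1)}
    for offset, value in enumerate(forecast, start=1)
]
for row in rows:
    print(row)
```

Simple exponential smoothing yields a flat forecast, so it cannot capture trend or seasonality. That limitation is one reason to choose the model according to the use case.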

 

Conclusion

Modeler is an easy-to-use application that puts the power of predictive modeling in the hands of business users. Using predictive models, you can identify patterns based on what has happened in the past and use them to predict what is likely to happen in the future.

 

  1. Automate decisions using business rules and add insight using predictive models.
  2. Use prioritization, optimization, or simulation to reach the best decision based on the above.
  3. Conduct analysis regardless of where the data is stored and regardless of whether it is structured or unstructured.
  4. Solve a variety of business problems with an extensive range of analytics.

Next steps and recommendations

Modeler User Guide:

http://www-01.ibm.com/software/analytics/spss/products/modeler/

-------

Author: Mrs. Shruthi V Gowdru

Job Title: Software QA Engineer

Email: shgowdru@in.ibm.com

Bio: Shruthi works as a Software Engineer (QA) for IBM Policy Atlas and SIQ for the Legal team under ECM. She has 4.5 years of experience in QA.

Company: IBM

 


UID

ibm11281040