How and why static analysis of code saves you and your customers time and money
This content is part of the series: Static analysis IBM Rational Software Analyzer
As people who work in the tech industry, we usually consider ourselves top-notch developers and, for the most part, this is true. However, no matter how well-educated or experienced we are, and despite our best intentions, we continue to write code that has bugs. Most software applications have become so complex that understanding the requirements and writing the source code to address them has become such a difficult operation that it is virtually impossible to do without creating unintended behaviors or those outright defects that we call bugs.
Consider almost any mission-critical application in your organization, and you will discover that it is constructed from hundreds of classes and many thousands of lines of code. To help understand and implement such complex systems, we in this industry have designed new techniques, such as agile development processes. Yet even with short development iterations and peer code review, large numbers of defects still lurk in the software that we sell to our customers. Even though they are a step in the right direction, our current processes are not enough. More and more, we need the assistance of automated analysis tools.
We have used dynamic analysis tools for decades. Everyone has debugged code with a runtime debugger or found performance hotspots with a code profiler. Yet we use these tools too late in the development cycle to be cost-effective. The best time to find problems is when you review the source code as it is written. With the aid of static analysis tools, much of the heavy lifting can be handled automatically. Bear in mind, however, that static analysis is simply another tool to improve code quality; it is not a complete replacement for manual code reviews.
This article will introduce you to IBM® Rational® Software Analyzer static analysis tools. The goals are to gently ease you into automated code analysis and to present some of the benefits that this can bring to your software development process. Rational Software Analyzer is designed to encourage and simplify the process of automated improvements in code quality. It was initially an API and user interface to create and integrate static analysis tools into other Rational products, such as Rational® Application Developer and Rational® Software Architect, but has since evolved into a complete, standalone offering. Analyzer enables developers to easily access capabilities such as automated code review and structural analysis in their daily development processes.
What we mean by "static analysis"
Static analysis is many things to many people. If you look at the product landscape, you will discover dozens of companies claiming to offer static analysis tools. The market supports so many companies because the notion of static analysis is broad. Some companies focus only on C++ code review, while others offer only software metrics for the Java programming language. Some analyze code for security problems for Web applications, and others scan code for dependency problems. Thus, static analysis is a diverse and confusing concept that needs clarification.
So what is it? Static analysis means the study of things that are not changing. However, in software terms, this definition can be refined as the study of source or binary code that is not currently running. You already know that you need a debugger or profiler to analyze running code, but you can learn a lot from code without ever running a program.
For example, if you simply parse all of the source files for a program, you can ensure that the source code adheres to a predefined coding standard. You can also detect common performance problems, such as calling a method multiple times even though the result it produces does not change. You can even examine the imports of each class to understand what other classes it depends on or which classes depend on it. None of this requires the program to run or even to compile.
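To make the idea concrete, here is a minimal sketch of two such checks: listing what a source file imports and flagging names that break a coding standard. It is only an illustration, not how Rational Software Analyzer works internally. It assumes Python source and Python's standard `ast` parser in place of Java, and the naming rule (function names must be lowercase) and the helper names are invented for the example.

```python
import ast

# Sample source to analyze. It is only parsed, never executed.
SAMPLE = '''
import os
from collections import OrderedDict

def BadName(items):
    return len(items)

def good_name(items):
    return sorted(items)
'''

def imported_modules(source):
    """Return the sorted names of the modules that the source imports."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return sorted(modules)

def naming_violations(source):
    """Return (line, name) pairs for functions whose names are not lowercase.

    The lowercase rule is a hypothetical coding standard for this sketch.
    """
    return sorted(
        (node.lineno, node.name)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.FunctionDef) and node.name != node.name.lower()
    )

print(imported_modules(SAMPLE))   # ['collections', 'os']
print(naming_violations(SAMPLE))  # [(5, 'BadName')]
```

Both functions work on the parsed syntax tree alone, so the sample never has to run.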
Although there are many types of static analysis, they can be broken into a few common categories, based on the value that they provide. Table 1 captures the key categories, which are also the forms of static analysis covered in this series of articles about Rational Software Analyzer.
Table 1. Most common static analysis categories
|Category|Description|
|---|---|
|Code review|This type of tool is typically one that performs automated code parsing, where each source file is loaded and passed through a parser that looks for particular code patterns that violate a set of established rules. In some languages, such as C++, many of these rules are built into the compiler or available in external programs, such as Lint. In other languages, such as Java, the compiler does little in the way of automated code review. Code review is a good tool to enforce coding standards, find basic performance problems, and find possible API abuse. Code review can also include deeper forms of analysis, such as data flow, control flow, type state, and so forth. Some of these are discussed in other articles in this series.|
|Code dependency|Rather than examining the format of individual source files, code dependency tools examine the relationships between source files (typically, classes) to create a map of the overall architecture of a program. Dependency tools are commonly used to discover known design patterns (good) or common anti-patterns (bad) in code.|
|Code complexity|Complexity tools analyze the program code and compare it to established software metrics to determine whether it is unnecessarily complex. If a particular piece of code exceeds a given threshold, it can be flagged as a candidate for refactoring to help improve maintainability.|
|Trending|Trend analysis does not use code artifacts directly. Rather, it is the study of improvements or degradations in code quality, based on other forms of analysis (essentially, it analyzes the results of analysis). Results generated by these tools typically appeal to managers, executives, and customers rather than developers, because they make a statement about the direction of quality improvements, thereby answering the question: "Is the code getting better or worse?"|
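As an illustration of the code complexity category, the following sketch approximates a cyclomatic complexity score for each function and flags those above a threshold. It is a simplified stand-in, not the metric set that Rational Software Analyzer ships. It assumes Python source, and the counted node types, the function names, and the threshold value are all choices made for this example.

```python
import ast

# Node types counted as decision points (a simplification for this sketch).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def complexity(func):
    """Approximate cyclomatic complexity: 1 plus the number of branching nodes."""
    return 1 + sum(isinstance(n, BRANCH_NODES) for n in ast.walk(func))

def refactoring_candidates(source, threshold):
    """Return {function name: score} for functions that exceed the threshold."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            score = complexity(node)
            if score > threshold:
                report[node.name] = score
    return report

SAMPLE = '''
def simple(x):
    return x + 1

def tangled(values):
    total = 0
    for v in values:
        if v > 0 and v % 2 == 0:
            total += v
        elif v < 0:
            total -= v
    return total
'''

print(refactoring_candidates(SAMPLE, threshold=3))  # {'tangled': 5}
```

Here `tangled` scores 5 (one loop, two `if` branches, and one `and` expression, plus 1), so it is flagged, while `simple` scores 1 and is not.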
Benefits of static analysis
This article has already alluded to some of the reasons for making static analysis part of your development process. To reiterate, there are two basic and compelling reasons to encourage static analysis: to save time and to save money. One aspect of the time savings achieved with static analysis tools should be fairly obvious: It takes you less time to get better-quality code. Many studies, including some conducted within IBM, claim that even simple automated code review will find 5 to 15% of all defects in code.
The same studies claim that a defect reported by one of your customers can cost $12,000 to $18,000 USD. If you consider a typically large piece of software that has a thousand defects during its life span, you quickly realize that using automated code review tools can save $600,000 to as much as $2.7 million USD. Regardless of which percentage or cost you believe in, the potential savings with static analysis tools is staggering.
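The arithmetic behind that range can be checked with a short calculation. The sketch below uses only the figures quoted above. It also makes explicit one assumption that the estimate depends on: every defect caught by automated review would otherwise have been reported by a customer. The function name is invented for the example.

```python
def estimated_savings(total_defects, found_percent, cost_per_defect):
    """Return the avoided cost in USD, using whole-number arithmetic.

    Assumes each defect found by automated code review would otherwise
    have reached a customer and incurred the full per-defect cost.
    """
    defects_found = total_defects * found_percent // 100
    return defects_found * cost_per_defect

# Low end: 5% of 1,000 defects at $12,000 each.
low = estimated_savings(1000, 5, 12_000)
# High end: 15% of 1,000 defects at $18,000 each.
high = estimated_savings(1000, 15, 18_000)

print(f"${low:,} to ${high:,}")  # $600,000 to $2,700,000
```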
Certainly, avoiding customer-reported defects is the most obvious way to reduce costs, and you can typically achieve that with a comprehensive testing process. However, using static analysis tools such as code review gives you a way to reduce costs even more.
We have all seen graphs like the one shown in Figure 1, and few would dispute the rationale behind it. Finding defects earlier in the development process costs less, and by using a simple automated code review, you can start finding defects during the coding phase of a project -- perhaps even while the developer is typing the code.
Figure 1. Comparison of costs to fix defects found at different stages of the development lifecycle
There is another aspect of savings that may not be as obvious, though. So far, we have focused on how developers can save time and money by using static analysis, but what about your customers? Customers are proving more often that they are willing to sue you if your software costs them time or money. You want customers to buy and use your software because you have a reputation for reliability. When you add static analysis to your development process, your customers gain time and money advantages, too. Higher-quality code means that your customers will not lose time waiting for you to fix a defect that they have reported. In addition, their ability to make money will not be hindered while they wait.
Rational Software Analyzer is designed to meet several requirements:
- First, it integrates tightly into an Eclipse, Rational Software Architect, or Rational Application Developer workbench, thus giving developers full access to analyze their code while they are writing it.
- Second, it is available in command-line or Ant task format to support integration into existing build systems. A complete API enables you not only to use the built-in analysis techniques and rules, but also to create your own.
- Finally, extracting analysis results and generating reports both in the workbench and in exported forms, such as HTML, enables developers, managers and executives to assess overall code quality.
Specify rules for running the analysis
When Rational Software Analyzer is available in your workbench, you will have new menu and toolbar options in the Java, Debug, C++, and Plug-in Development perspectives. In other perspectives, you may need to manually enable these features:
- From the Eclipse menu bar, select Window > Customize Perspective (see Figure 2).
Figure 2. Customize Perspective option
- When the dialog appears, select the Commands tab and click the Rational Software Analyzer check box.
- Then accept the change by clicking OK (Figure 3).
Figure 3. Commands tab view
You will see the new static analysis additions to the Eclipse toolbar and menu. These options enable you to create, modify, or run analysis configurations.
- Select the Run > Analysis menu option to display the main analysis configuration dialog.
You will see a dialog very much like the one used to run or debug code from the Eclipse workbench. For simplicity, it has been designed to function similarly to the dialog screens that you already use. You can add or remove analysis configurations by using the buttons in the top-left part of the dialog. As the name implies, a configuration is used to determine which forms of analysis and which rules are executed, as well as the scope of analysis (for example, a project, a working set, or the whole workspace).
- To get started, select the Analysis element in the Configurations list on the left side of the configuration dialog, and then click the New button.
You will notice that the right side of the dialog changes to show the basic configuration interface.
Configure the analysis
The first step in creating an analysis configuration is to specify the default range of resources that you want to analyze. You select the desired range within the Scope tab. The options currently available are to analyze the entire workspace, a working set, or a set of projects. For this exercise:
- Create a new configuration and, on the Scope tab, select the Analyze entire workspace option, as shown in Figure 4.
Figure 4. Specify the range of your analysis
Under the Rules tab, you specify the forms of analysis that you want to perform. You will notice that this tab displays a directory (tree) where you can select or deselect analysis elements, and that this tab includes additional buttons for importing and exporting rule selections. The top-most nodes of the domain tree are analysis providers, which represent the types of analysis tools that are recognized by the analysis framework. Providers contain categories, which are loose organizations of rules or other categories. Rules perform all of the heavy lifting in the process by defining the conditions that generate results during the analysis.
The check box before each node in the tree controls the enabling state of the element. When an element is selected or deselected, all of its child nodes are set to the same state, which allows for quick selection of entire categories or even the entire tree. For the exercise:
- Select the entire Code Review for Java branch.
Don't worry if the number of rules that you see in the analysis configuration dialog differs from what Figure 5 shows. This static analysis functionality is available in several Rational products, and the included rule sets vary.
Figure 5. Analysis configuration dialog
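The provider, category, and rule hierarchy on the Rules tab, together with the way a check box setting cascades to child nodes, can be modeled as a small tree. This is a sketch of the behavior as described, not the product's actual data model. The `Node` class, the category names, and the rule names are hypothetical; only "Code Review for Java" comes from the dialog.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A provider, category, or rule in the tree (hypothetical model)."""
    name: str
    children: list = field(default_factory=list)
    enabled: bool = False

    def set_enabled(self, state):
        """Set this node's check box and cascade the state to all children."""
        self.enabled = state
        for child in self.children:
            child.set_enabled(state)

    def enabled_rules(self):
        """Return the names of enabled leaf nodes, which this sketch treats as rules."""
        if not self.children:
            return [self.name] if self.enabled else []
        names = []
        for child in self.children:
            names.extend(child.enabled_rules())
        return names

# Hypothetical categories and rules under a real provider name.
tree = Node("Code Review for Java", [
    Node("Naming", [Node("Rule A"), Node("Rule B")]),
    Node("Performance", [Node("Rule C")]),
])

tree.set_enabled(True)               # select the entire branch
tree.children[1].set_enabled(False)  # then deselect one category
print(tree.enabled_rules())          # ['Rule A', 'Rule B']
```

Selecting the provider node enables every rule beneath it, and deselecting a single category afterward turns off only the rules in that category.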
Some rules will show additional configuration options. In this case, the lower part of the Rules tab displays any current settings for the rule; otherwise, it is empty. Figure 6 shows a sample list of rule parameters for one of the Java software metrics rules to give you a feel for this. When a rule shows parameters in this way, you can adjust them to suit your needs, and the new values will automatically be stored with the rule selection.
Figure 6. Sample list of rule parameters
Run the analysis and display the results
- To start the analysis, click the Analyze button.
When you do this, you will see the Analysis Results view in the Eclipse workbench. Depending on which kind of analysis you are doing, the results view may differ. Some results views, like the one provided by Java code review, support viewing results in more than one format (a table or a tree, for example).
As Figure 7 shows, if your analysis configuration contained selections for more than one type of analysis (code review and architectural discovery, in this case), the results view will include a tab for each analysis provider’s results.
Figure 7. Java Code Review results view
If you right-click a result, you can perform special tasks, such as viewing the source code where the problem occurred or applying a quick fix, which corrects the problem automatically. A quick fix is available only if the rule author has provided a quick-fix routine for the rule.
- If the Quick Fix menu option is enabled, selecting it will walk you through a process to correct the problem.
Figure 8. Quick Fix option
It is important to note that the viewer used to render a result is a function of the type of data that it contains. When viewing results, you might see a source file opened in the editor with highlighted text, or a UML diagram, or a table of statistical data. There is really no common way to view a result; this is determined by the author of the rule. For the Java code review analysis provider, all results are viewed as editable Java source files.
Export and report
There are a couple of other common functions that may be available in the results view, as well (this depends on the type of analysis that you are performing). These are in the form of data exporting and reporting.
Data export, as the name implies, allows you to export the raw analysis results to a file, typically an XML file format. The type of data exported has been determined by the analysis provider, which will supply a list of known data exports and allow you to select which one to perform (see Figure 9).
Figure 9. Analysis Reporting view
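To give a feel for what a raw XML export might contain, the sketch below serializes a list of results with Python's standard `xml.etree.ElementTree` module. The element names, attribute names, result fields, and sample values are all assumptions made for illustration. The real export format is defined by each analysis provider and may look quite different.

```python
import xml.etree.ElementTree as ET

def export_results(results):
    """Serialize analysis results to an XML string (hypothetical schema)."""
    root = ET.Element("analysisResults")
    for r in results:
        ET.SubElement(
            root,
            "result",
            rule=r["rule"],
            severity=r["severity"],
            file=r["file"],
            line=str(r["line"]),  # XML attribute values must be strings
        )
    return ET.tostring(root, encoding="unicode")

# Invented sample results.
results = [
    {"rule": "Rule A", "severity": "Severe", "file": "Order.java", "line": 42},
    {"rule": "Rule B", "severity": "Warning", "file": "Cart.java", "line": 7},
]

xml_text = export_results(results)
print(xml_text)
```

Because the output is plain XML, another tool can read it back without access to the workbench, which is the main reason to export raw results.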
In many ways, reporting is similar to exporting data. In fact, both functions share exporters. However, reporting generates nicely formatted pages that can be stored locally or written directly to a remote Web site. You can take any existing report file and modify it to suit your needs (add a company logo, for instance). The generated report will resemble those shown in Figures 10 and 11, but because the reporting engine is quite flexible, other variations are available.
Figure 10. Java code review
Figure 11. Java code review Severity Summary illustrated by a pie chart
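The data behind a severity summary like the one in Figure 11 is just a count of results per severity level, expressed as a share of the total. Here is a minimal sketch of that calculation. The severity labels and the result structure are assumptions for the example, not the product's actual values.

```python
from collections import Counter

def severity_summary(results):
    """Return {severity: percentage of all results}, rounded to one decimal."""
    counts = Counter(r["severity"] for r in results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sev: round(100 * n / total, 1) for sev, n in counts.items()}

# Invented sample results with hypothetical severity labels.
sample = [
    {"rule": "Rule A", "severity": "Severe"},
    {"rule": "Rule B", "severity": "Severe"},
    {"rule": "Rule C", "severity": "Warning"},
    {"rule": "Rule D", "severity": "Recommendation"},
]

print(severity_summary(sample))
# {'Severe': 50.0, 'Warning': 25.0, 'Recommendation': 25.0}
```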
(Optional) Create custom rules and categories
In addition to the rules supplied by Rational Software Analyzer and any rules contributed by third-party developers, you can create custom categories and custom rules from templates, without writing any code. To create new custom rules and categories:
- Go to the Preference page by selecting Window > Preferences.
- Then, in the Preferences tree, select Analysis > Custom Rules and Categories (see Figure 12).
Figure 12. Custom Rules and Categories view in Preferences
- Click Add Category to add a new custom category.
This takes you through a simple wizard, where you can choose the parent category and a name for the new one. The tree control for custom categories will show the complete path of categories for any custom category. Only previously defined custom categories can be deleted.
- Click Add Rule to start the rule-creation wizard.
On the first wizard screen, you can select where the rule will be located in the analysis category tree.
- On the first wizard page, select a category and then click Next.
- On the second wizard page you will see a list of all rule templates. Select the rule template that you want to use as the basis for your new rule.
Note that not all analysis providers support custom rules. However, Java Code Review supplies several rule templates that are at your disposal.
Figure 13. Templates available for creating custom rules
On the final wizard page, you will see entries for each parameter defined in the rule template. In the example shown in Figure 14, the selected rule template defined only one parameter; therefore, you can enter only a qualified class name in the field provided.
- Either use the Browse button to browse to an existing class, or manually enter a valid class name in the text box.
Figure 14. Assign Template Values (parameters) view
- Click the Finish button to create the template-based rule and add it to the rule tree.
You can select this rule as part of any analysis configuration hereafter.
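Conceptually, a rule template is a rule with a blank left in it, and the wizard fills that blank with the parameter you supply. The sketch below mimics this with a hypothetical template that flags any import of a given class. The template itself, the function names, and the use of Python source and Python's `ast` module are all assumptions for illustration. The templates that Java Code Review actually supplies are the ones listed in the wizard.

```python
import ast

def make_forbidden_class_rule(qualified_name):
    """Create a rule from a hypothetical 'avoid this class' template.

    The single template parameter is a qualified class name, such as
    'collections.OrderedDict'.
    """
    module, _, class_name = qualified_name.rpartition(".")

    def rule(source):
        """Return the sorted line numbers where the class is imported."""
        hits = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.ImportFrom) and node.module == module:
                if any(alias.name == class_name for alias in node.names):
                    hits.append(node.lineno)
        return sorted(hits)

    return rule

# Fill in the template parameter to get a concrete rule.
no_ordered_dict = make_forbidden_class_rule("collections.OrderedDict")

source = "import os\nfrom collections import OrderedDict\n"
print(no_ordered_dict(source))  # [2]
```

The same template can produce many rules, one for each class name supplied, which is why no new code is needed to add them.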
This first article of a four-part series introduced static analysis in general terms and then explained key features of Rational Software Analyzer, which is designed to help you find code-quality problems early in the development cycle. In Part 2, you can take a closer look at the Java code review capabilities of Rational Software Analyzer. This will include an in-depth study of basic rule authoring, using the supplied API, and more advanced features that you can use to create rule templates or rules with variable data.
- Visit the Rational Software Analyzer area on developerWorks for introductory to in-depth information.
- In part 2 of this series, learn about Creating rules and rule filters to extend Java code review.
- Part 3 of this series offers suggestions on Enhancing rules for Java code review.
- Part 4 of this series discusses techniques for Integrating your own analysis tools.