IBM Support

Intelligent Finding Analytics (IFA) Server for AppScan Source

Question & Answer


Question

How do I take advantage of auto-triage and analysis of findings from AppScan Source?

Answer

Intelligent Finding Analytics (IFA)

 


Static Analysis Security Testing (SAST)

 

 

  • Static Analysis Security Testing (SAST) is a powerful way to identify potential security flaws in a program.

    While techniques are varied, SAST tools are designed to produce security warnings representing potentially risky pathways through a program’s source code. Risky pathways typically involve receiving user-controlled data that then makes its way through the source code and ends up outside the program, such as data from a website purchase form making its way into an orders database table. The reverse is sometimes true as well: data can go from the database back out to the user.

    SAST tools present a security warning, or finding, when the user data is not modified to remove attacks (sanitization) or is not checked against a list of known good characters (validation or whitelisting). The absence of sanitization or validation triggers a finding to be flagged as potentially damaging. Such potentially damaging pathways and findings are what SAST tools try to find within source code.

    SAST tools are thorough and deep in their assessments. This is both a strength and a challenge for the tools; findings by the thousands are not uncommon for a single program and comprehensive review of those findings can be onerous. Traditionally there are two ways to manage such large sets of findings, as follows:
    • Scan less of the attack surface of the program. This leaves fewer findings to review but increases the risk that important findings are not identified.
    • Add more people to review the results.

    Intelligent Finding Analytics (IFA) offers an additional approach to managing large sets of findings from SAST assessments.


Introducing Intelligent Finding Analytics (IFA)

 

 

  • Intelligent Finding Analytics (IFA) is a new way to absorb the findings of SAST tools so that you can more efficiently form an action plan for resolution based on those findings.

    IFA identifies the most actionable findings from a SAST security assessment. IFA defines actionable findings as both real vulnerabilities and findings that have a higher probability of exploit. IFA is 95-98% accurate when determining whether a finding is actionable.

    Reductions of 98-99% of the original finding counts are common when applying IFA to SAST analysis; this reduces the workload not just for finding security issues, but for resolving them as well. Application of IFA to SAST findings typically reduces a security assessment with 10k findings to a few hundred for humans to review.







Operation

 

 

  • IFA runs in two phases, as follows:
    • Phase 1 uses pre-defined exclude filters to set findings as not interesting. This list of filters can be modified or augmented.

      Findings which are not interesting are set with the excluded flag and can be viewed in the Excluded Findings view in Source for Analysis.
    • Phase 2 uses supervised machine learning techniques on all remaining findings to determine if a finding is actionable.

      Findings which are interesting and/or actionable can be viewed in the Findings View in Source for Analysis.

    In all cases, IFA adds a note describing why the finding was either included or excluded in the final IFA assessment. View notes in the Finding Detail View or the notes column in the finding tables present in other views.


Interpreting the IFA results

 

 

  • IFA is based on statistical probability. Machine learning helps frame the probability of the security impact of a finding based on the training set currently in use. IFA represents probability of security impact in three ways, as follows:
    • the probability that the finding is interesting
    • the probability that the finding is of a certain severity
    • the probability that the finding is not interesting

    Probabilities are between 0% and 100%. The closer to 100%, the more confident IFA is in the validity of the response.

    Probability can be used as a guide during the manual review of IFA findings. The notes field of each finding has information on the probabilities that apply to the finding itself. View notes in the Finding View or the notes column in the various finding tables.


Modifying pre-filters for phase 1

 


  • The pre-filtering mechanism uses exclusion type filters only and follows the format currently used in IBM AppScan Source. Exclude filters are located here:

    <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\ml\scan_filters\exclude

    The directory includes a general directory as well as a number of programming language specific subdirectories. Filters placed into the general directory apply to all assessments of any language submitted to IFA. Filters placed into the language-specific subdirectories apply to assessments for that specific language only.

    You can copy the filters used in IBM AppScan Source for Analysis to the proper directory. Filters in the general folder exclude findings globally; filters in a language-specific folder exclude findings for that language only. Only non-inverted exclude filters are applied; any other filter is ignored.
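
The scoping rule above (general filters apply to every assessment, language-specific filters only to that language) can be sketched in Python. This is an illustration rather than IFA code: the function name, the example subdirectory names, and the assumption that every filter file ends in .off are hypothetical.

```python
from pathlib import Path


def applicable_filters(exclude_dir, language):
    """List the filter files that would apply to an assessment of `language`.

    Assumptions (not confirmed by the IFA documentation): each language
    subdirectory is named after the language, and filter files end in .off.
    """
    exclude_dir = Path(exclude_dir)
    filters = []
    # Filters in "general" apply to every language; the language-specific
    # folder applies only to assessments of that language.
    for folder_name in ("general", language):
        folder = exclude_dir / folder_name
        if folder.is_dir():
            filters.extend(sorted(folder.glob("*.off")))
    return filters
```

Calling applicable_filters(r"<UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\ml\scan_filters\exclude", "java") would return the general filters followed by any filters in a java subdirectory, if one exists.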

    Filters of specific interest are as follows:
    • All vulnerabilities are listed in Vulnerabilities.off, located at:

      <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\ml\scan_filters\Vulnerabilities.off
    • Vulnerabilities currently excluded during IFA as not interesting are listed in IFA1001.off, located at:

      <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\ml\scan_filters\exclude\general\IFA1001.off
    Modify the lists by removing the lines that represent the vulnerabilities to be considered for the IFA machine learning process.

    See also Recent Updates at the end of this document.







Modifying the training set for phase 2

 

  • The machine learning employed during phase 2 of IFA uses a training set to build a classification model. The model is then used to perform predictions for one of four classifications, as follows:
      • high actionable
      • medium actionable
      • low actionable
      • not interesting

    IFA phase 2 assigns each classification a prediction value between 0 and 1 that represents the probability the machine is correct in the classification based on this training set. The system then looks at the resulting probabilities for each classification for each finding and chooses a response by selecting the highest resulting probability.
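
The selection step described above, picking the classification with the highest probability, can be sketched as follows. This is a minimal illustration, not IFA's implementation: the function name, the dictionary input, and the tie-breaking order are assumptions.

```python
CLASSES = (
    "high actionable",
    "medium actionable",
    "low actionable",
    "not interesting",
)


def choose_classification(probabilities):
    """Return (label, probability) for the most probable classification.

    `probabilities` maps each of the four labels to a value between 0 and 1.
    Ties go to the label listed first in CLASSES (an assumption).
    """
    for label in CLASSES:
        if label not in probabilities:
            raise ValueError(f"missing probability for {label!r}")
        if not 0.0 <= probabilities[label] <= 1.0:
            raise ValueError(f"probability for {label!r} is outside 0..1")
    best = max(CLASSES, key=lambda label: probabilities[label])
    return best, probabilities[best]


example = {
    "high actionable": 0.07,
    "medium actionable": 0.81,
    "low actionable": 0.09,
    "not interesting": 0.03,
}
print(choose_classification(example))  # ('medium actionable', 0.81)
```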

    The training set “learns” when you manually classify new assessments and add them to it. IFA reads the classification from the assessment files using the following algorithm in order of preference:
      • Notes
        • 1=High
        • 2=Medium
        • 3=Low
        • 4=Not Interesting
      • Modified Severity
        • High=High
        • Medium=Medium
        • Low=Low
        • Info=Not Interesting
      • Severity
        • High=High
        • Medium=Medium
        • Low=Low
        • Info=Not Interesting
      • Excluded Findings=Not Interesting

    The easiest path is to adjust the severity to match the desired classification. Any findings deemed not interesting can be excluded; they will be applied to the training set as not interesting findings.

    The training assessments are all stored at:

    <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\ml\spark\train

    Any modification to the factory set or addition of a new assessment triggers IFA to rebuild the models for prediction. Restart the server to take advantage of the updated training files.


Modifying finding severity

 

 

  • IFA modifies the severity of some of the findings if it is confident enough in the prediction. The confidence comes from the probability IFA uses to determine if a finding should be interesting. If this probability is above 50% IFA adjusts the severity of the finding to the selected severity.

    The probability threshold can be configured through the setting sev_adjust_threshold in <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\config\ifa-server.apsettings. The acceptable values are between 0 and 1. The default is 0.5, or 50%.
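
The threshold rule can be sketched in a few lines of Python. The function and parameter names are hypothetical; only the comparison against sev_adjust_threshold and its 0.5 default come from the text above.

```python
def adjusted_severity(current, predicted, interesting_probability, threshold=0.5):
    """Return the severity a finding would carry after the threshold check.

    `threshold` plays the role of sev_adjust_threshold (default 0.5).
    The severity changes only when the probability is above the threshold.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    if interesting_probability > threshold:
        return predicted
    return current


print(adjusted_severity("Low", "High", 0.72))  # High
print(adjusted_severity("Low", "High", 0.50))  # Low
```

Raising the threshold makes severity changes rarer; lowering it makes them more frequent.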

    Fix groups
    SAST assessment findings can be numerous and generally involve pathways through library APIs using arguments or return values. In many cases, an API with two or more arguments will have the same entry point, flow through a different one of the arguments and flow out to the same exit point. Each argument then would represent a different node in a different finding. If there are multiple APIs following this pattern, the SAST returns even more findings that all try to say the same thing: this API is a crossroad in the traces produced.

    Fix groups attempt to find these crossroads and collect like findings together. The collection of findings that result share one or more common locations, or crossroads, which helps in the resolution process. This is similar to tasks a security analyst performs when bundling findings together for a defect. Findings which share nodes in common typically are findings which are saying the same thing to the user. By locating and collecting such findings into groups it becomes easier to review the findings and understand what they are trying to say.

    Change in a software system invariably creates different behaviors, some of which are good and others that are bad and become defects. Minimizing change in a system for the desired behavior becomes important; fix groups themselves are intended to be a guide to resolving the security warnings with the least amount of change. Fix groups bundle findings together that share at least one commonality and present a more complete picture of the security health of a program.
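
To make the crossroad idea concrete, here is a deliberately simple Python sketch that groups findings by the trace node they share with the most other findings. It is not IFA's grouping algorithm; the data layout, the node names, and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def group_by_crossroad(findings):
    """Group finding IDs by their most widely shared trace node.

    `findings` maps a finding ID to the list of node names in its trace.
    """
    # Count how many findings pass through each node.
    node_counts = Counter()
    for nodes in findings.values():
        node_counts.update(set(nodes))

    groups = defaultdict(list)
    for finding_id, nodes in findings.items():
        if not nodes:
            raise ValueError(f"finding {finding_id!r} has no trace nodes")
        # The crossroad is the node shared with the most findings;
        # ties are broken alphabetically (an arbitrary choice).
        crossroad = max(sorted(set(nodes)), key=lambda node: node_counts[node])
        groups[crossroad].append(finding_id)
    return dict(groups)


traces = {
    "F1": ["srcA", "Util.concat", "sinkX"],
    "F2": ["srcB", "Util.concat", "sinkY"],
    "F3": ["srcC", "Logger.info"],
}
print(group_by_crossroad(traces))
# {'Util.concat': ['F1', 'F2'], 'Logger.info': ['F3']}
```

F1 and F2 differ in source and sink but both flow through Util.concat, so a single fix at that node could address both.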


Usage vs implementation fix groups

 

 

  • There are two kinds of fix groups in IFA, as follows:
    • Usage. Usage groups involve using third-party code that represents the common area in the code base through which many findings flow. In effect, the commonality is through the “use” of a third-party API. Typically a sanitizer around the return value is used; however in some cases the arguments going in to the third-party API are what need to be cleansed.
    • Implementation. Implementation groups are the implementation of user code represented by an API which is modifiable by the user and represents the common node. The developers of the application should know what kind of data is allowed in different arguments in most cases. A simple whitelist of those known characters, or a pattern match for the right patterns, is typically where the code should be cleansed.


Interpreting the fix group results

 

 

  • The fix groups are returned as a list of assessments. The name of each assessment file indicates its contents and uses the format <FINDING COUNT>_<VULNS>.ozasmt. Some filenames include the word unique, denoting that the findings have only the vulnerability in common. The node that is most common across the findings is set as the application name of the assessment, viewable in the My Assessments view.


Delta analysis

 

 

  • Delta analysis of fix groups presents the difference between two assessments. The older assessment is used as the baseline and compared to the new assessment. The timestamp attribute within the assessment is used for the date calculation, not the file timestamp.

    There are two functions available within the delta analysis: new and missing. Each returns an assessment containing the findings from the requested delta. The new delta represents the findings that exist only in the newer assessment; the missing delta, also called the resolved delta, represents the findings that exist only in the baseline assessment. Any two assessments derived from any source can be used with this call.
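
Conceptually, the two deltas are set differences between the baseline and the newer assessment. The Python sketch below assumes each finding can be reduced to a comparable identifier and that an assessment is a dictionary with timestamp and findings keys; how IFA actually matches findings between assessments is not described here.

```python
from datetime import datetime


def delta_analysis(first, second):
    """Return the new and resolved findings between two assessments.

    The assessment with the older embedded timestamp is the baseline,
    regardless of the order in which the two are passed.
    """
    baseline, newer = sorted((first, second), key=lambda a: a["timestamp"])
    return {
        "new": newer["findings"] - baseline["findings"],
        "resolved": baseline["findings"] - newer["findings"],
    }


march = {"timestamp": datetime(2018, 3, 1), "findings": {"f1", "f2", "f3"}}
april = {"timestamp": datetime(2018, 4, 1), "findings": {"f2", "f3", "f4"}}

result = delta_analysis(april, march)
print(sorted(result["new"]))       # ['f4']
print(sorted(result["resolved"]))  # ['f1']
```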


IFA Server

Acquiring the IFA Server

  • To acquire the Intelligent Finding Analytics Server, contact your IBM representative.


Recommended IFA Server requirements

 

  • Any operating system that AppScan Source for Analysis supports
  • Java 1.8, 64-bit
  • 16GB of memory (8GB minimum)
  • 20GB of disk space (4GB minimum)

 

Supported languages

IFA supports the following languages:

  • Java
  • .NET
  • C/C++
  • Objective-C
  • PHP
  • Visual Basic
  • ASP
  • Arxan Android
  • Arxan iOS
  • Pattern-based

 

IFA does not support the following languages:

  • COBOL
  • ColdFusion
  • JavaScript
  • NodeJS
  • Perl
  • Python
  • PL/SQL
  • TSQL
  • WSDL


Server setup

  • To set up the IFA Server, unzip the package.

    On Linux or Mac, run chmod +x wlp/bin/server to make the server script executable before you start the server.

    To configure the ports, review the server.xml file in <UNZIP DIRECTORY>\wlp\usr\servers\ifa\server.xml. The ports are controlled by the httpPort and httpsPort attributes of the httpEndpoint element.

    <server description="new server">
        <!-- Enable features -->
        <featureManager>
            <feature>localConnector-1.0</feature>
            <feature>jaxrs-2.0</feature>
            <feature>servlet-3.1</feature>
            <feature>jsp-2.3</feature>
            <feature>ssl-1.0</feature>
        </featureManager>
        <keyStore id="defaultKeyStore" password="{xor}Pi8vLDw+MQ==" />
        <!-- To access this server from a remote client add a host attribute to the following element, e.g. host="*" -->
        <httpEndpoint httpPort="9080" httpsPort="9443" id="defaultHttpEndpoint" host="*"/>
        <webContainer deferServletLoad="false"/>
        <applicationMonitor updateTrigger="mbean"/>

        <webApplication id="IfaServer" location="IfaServer.war" contextRoot="rest" name="IfaServer"/>
    </server>

    The IFA server is based on Open Liberty 18.0.0.1.


Installing the IFA Server as a Windows service

 

 

  • To install the IFA server as a Windows service:
    1. Download the Apache Commons daemon from: http://www.apache.org/dist/commons/daemon/binaries/windows/
    2. Unzip the binary file.
    3. Copy prunsrv.exe from the directory into which the daemon was unzipped to wlp\bin.
    4. Use the following text in a batch file to add the service, where IFA_INSTALL_DIR is the directory location of the unzipped IFA server files:

      @echo off
      set IFA_INSTALL_DIR=<IFA Server unzip directory>

      "%IFA_INSTALL_DIR%\wlp\bin\prunsrv.exe" //IS//IFA_LIBERTY --Startup=manual --DisplayName="IFA Server" --Description="Intelligent Finding Analytics Service" ++DependsOn=Tcpip --LogPath="%IFA_INSTALL_DIR%\wlp\usr\servers\ifa\logs" --StdOutput=auto --StdError=auto --StartMode=exe --StartPath="%IFA_INSTALL_DIR%\wlp\usr\servers\ifa" --StartImage="%IFA_INSTALL_DIR%\wlp\bin\server.bat" ++StartParams=start#ifa --StopMode=exe --StopPath="%IFA_INSTALL_DIR%\wlp\usr\servers\ifa" --StopImage="%IFA_INSTALL_DIR%\wlp\bin\server.bat" ++StopParams=stop#ifa

    Notes on the IFA Server as a Windows service:
    • The service is initially set up to start manually. You can change the service to start automatically at Services > IFA Server > Properties.
    • The user the service uses is “system.” This should be changed immediately to a low-privilege user with limited rights at Services > IFA Server > Properties > Logon tab. Any changes associated with the user password should be noted here as well.


Start/Stop

 

 

  • On Linux or Mac, add the execute bit to the server file:
    cd <UNZIP DIRECTORY>
    chmod +x wlp/bin/server

    To start the Liberty server in the foreground:
    cd <UNZIP DIRECTORY>
    wlp/bin/server run ifa

    To stop a Liberty server that is running in the foreground:
    Ctrl+C

    To start the Liberty server in the background:
    cd <UNZIP DIRECTORY>
    wlp/bin/server start ifa

    To stop a Liberty server that is running in the background:
    cd <UNZIP DIRECTORY>
    wlp/bin/server stop ifa


Logs

 

 

  • The logs for both the Liberty server and the servlet are collected in <UNZIP DIRECTORY>\wlp\usr\servers\ifa\logs\messages.log. If an exception is not handled by the server, its log goes into the ffdc folder.







Helpful links

 

 

  • The IFA Server takes one to three minutes to initialize. Once initialized, there are some helpful links that should be the first stop for a new installation:
    • Version check:
      curl "http://localhost:9080/rest/ifa/v1/version"
      Response: b250c8c1-e6ce-4f61-b534-6bd383ca4a14
    • Health check:
      curl "http://localhost:9080/rest/ifa/v1/health"
    • REST documentation:
      curl "http://localhost:9080/rest/docs/index.html"
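
Because initialization takes one to three minutes, a script that drives the server may want to wait for the health check before submitting work. The Python sketch below polls the health URL shown above. Treating an HTTP 200 response as healthy is an assumption, since the response body of the health call is not documented here, and the function names and retry settings are hypothetical.

```python
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:9080/rest/ifa/v1"


def http_status(url, timeout=5):
    """Return the HTTP status code for `url`, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_healthy(base_url=BASE_URL, attempts=36, delay=5.0,
                       status_of=http_status, sleep=time.sleep):
    """Poll the health endpoint until it answers 200 or attempts run out.

    The defaults (36 attempts, 5 seconds apart) cover about three minutes.
    """
    for attempt in range(attempts):
        if status_of(f"{base_url}/health") == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would typically run wait_until_healthy() right after wlp/bin/server start ifa and proceed only when it returns True.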







Configuration

 

 

  • The configuration file for the IFA Server is located in:

    <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\config\ifa-server.apsettings

    Any change to the settings requires a restart of the Liberty server.
      • debug_ifa_server
        Leaves job data on the file system after a job is complete. This setting is useful for debugging upload payloads. The default value is false.
      • retention_days
        Sets the number of days the file system retains job results. The default is 30 days.
      • include_custom_high
        If an uploaded assessment is supplied for IFA and has a modified severity of high, IFA automatically adds that finding to the returned result.
      • json_logging
        Transforms the logs produced by the servlet operations into JSON format for easier consumption by aggregation programs. It does not affect the Liberty-specific logging. The default value is false.
      • log_level
        Sets the log level for the server.


Estimating disk space

All of the IFA data, along with the server itself, resides in the same file system location where the unzipped server file resides. As such, you need to ensure that you have enough disk space for storing IFA scan data as well as for the server itself.

IFA server file sizes are as follows:
· Zipped: 325MB
· Unzipped: 1.37GB

When you unzip and install the server, the process evaluates available space. If you have less than 1GB of available space, health check will fail. You can still unzip the server and run scans, but you are more likely to encounter odd behaviors as disk space dwindles.

Each IFA job (IFA, fix groups, or delta analysis) requires disk storage that is twice the assessment size. Because a single assessment can range in size from 500KB to 50MB or more, it can be hard to estimate needed storage on a jobs-per-GB basis. However, conservatively using a 1MB original assessment size, you can calculate there will be roughly 2MB of storage space needed per job and 500 jobs per GB. Additionally, it is not uncommon for a single assessment to use three different and complete jobs to get the final bit of data. Given that, there are roughly 166 assessments which can be run through the whole IFA process per GB.

As such, we recommend a minimum of 4GB of free space available before attempting to unzip and install the server file for adequate storage of both the server and associated IFA data. A minimum of 4GB of disk space allows for about 1.3GB of scan data (approximately 200 standard assessments) before encountering disk space issues.
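
The storage arithmetic above can be reproduced with a short Python helper. The function name and parameters are hypothetical; the factors (storage at twice the assessment size, three jobs per assessment, 1000MB per GB) are taken from the estimate above.

```python
def assessments_that_fit(scan_data_mb, assessment_mb=1, jobs_per_assessment=3,
                         storage_factor=2):
    """Estimate how many assessments fit in `scan_data_mb` of free space.

    Each job needs `storage_factor` times the assessment size, and a full
    IFA pass is assumed to use `jobs_per_assessment` jobs.
    """
    jobs = scan_data_mb / (assessment_mb * storage_factor)
    return int(jobs // jobs_per_assessment)


print(assessments_that_fit(1000))  # 166 assessments per GB
print(assessments_that_fit(1300))  # 216, in line with "approximately 200"
```

Passing a larger assessment_mb shows how quickly capacity shrinks for bigger assessments; for example, 10MB assessments cut the per-GB figure to 16.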

Estimating throughput

Average processing of a single 1MB assessment is as follows:
· IFA: 30 seconds
· Fix grouping: 15 seconds
· Delta analysis: 15 seconds
· Full IFA processing: 1 minute

Note: This is an average. Analysis on your assessment may take more or less time.

The IFA Server uses concurrent job threads to process each request. The number of concurrent jobs that can run is controlled by how many cores the host system has available, divided by two.

For example, a 24-core machine runs 12 concurrent jobs (24/2). At 60 complete IFA jobs per hour for each concurrent job, the IFA server on this host can process on average 720 complete jobs per hour (60x(24/2)=720). If just one of the three options is used, such as only triaging the assessment, the IFA server on this host can perform 1,440 IFA jobs, 2,880 fix group jobs, or 2,880 delta jobs per hour.
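
The same throughput estimate can be written as a small Python function. The function name is hypothetical, and rounding down for an odd number of cores is an assumption; the per-job times are the averages listed above.

```python
def jobs_per_hour(cores, seconds_per_job):
    """Estimate hourly throughput for a host with `cores` cores.

    Concurrent jobs are cores divided by two; rounding down for odd
    core counts is an assumption.
    """
    concurrent_jobs = cores // 2
    return concurrent_jobs * 3600 // seconds_per_job


print(jobs_per_hour(24, 60))  # 720 full IFA passes
print(jobs_per_hour(24, 30))  # 1440 IFA-only jobs
print(jobs_per_hour(24, 15))  # 2880 fix group or delta jobs
```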

Client tooling

 

 

  • The sample client tooling is available on GitHub at AppSecDev. The sample tooling performs the preparation, REST calls, and retrieval of the completed job through the REST APIs. More information is available on the AppSecDev GitHub page.


Running the IFA sample

 

 

This assessment can now be loaded into Source for Analysis or with the developer plugins currently available with AppScan Source.

Integrating IFA into the software development lifecycle

Each call in the IFA server can be used independently of the other calls. Some or all of the calls can be applied to any SAST assessment at any point in its lifecycle.

For example, an assessment can be sent through the fix group call without previously applying IFA to it. Similarly, an assessment can be compared with a baseline prior to IFA and then sent through the IFA with the “new” findings.

Workflows
There are two primary workflows that can be modified to fit individual needs. These workflows are a guide to understanding how pieces of IFA best fit together; they are not the only way to use IFA.

The two primary workflows are as follows:

 

    • Initial onboarding: manual analysis of a raw initial assessment.
    • Continuous: understanding what is fixed and what is new for human review.


Initial onboarding

 

  • Initial onboarding is when a raw initial assessment for an application is analyzed manually. This initial analysis typically involves resolving lost sinks, adding new filters to rule out certain finding patterns, and bundling the remaining findings into groups from which a developer can remediate the results.

    IFA takes most of the manual effort out of the process. User identification of filters can be augmented and in many cases replaced using IFA. The bundling of findings into like groups can be augmented or replaced through fix groups.

    Using IFA, the initial onboarding workflow may look something like this:
      • Initial raw assessment -> IFA -> IFA Assessment -> Fix Grouping -> Fix Group Assessment bundles -> fix group assessment defects

    This workflow involves two of the three actions available. The goal in this workflow is to get the initial set of findings for human review out as quickly as possible.


Continuous

 

 

  • Continuous remediation is understanding what is fixed (resolved queue) and what is new (new findings queue) for human review. IFA addresses these requirements.

 

  • Resolved queue
    • A single fix group can be used as a baseline for delta analysis to determine whether all or only some of the findings in a fix group have been addressed in the latest scan. Further, the initial IFA assessment can also be compared against a new IFA assessment to get a raw list of findings yet to be resolved.

      There are a variety of ways these operations can fit together, including the following:
      • IFA Assessment -> Fix Grouping -> Fix Group Assessment becomes a defect and baseline
      • Fix Group assessment and new scan assessment -> Delta Resolved Analysis -> Resolved assessment.

      The number of findings in the resolved queue describes how many of the findings in the fix group have been resolved. If this number is 0, the fix group has been remediated. If the number is greater than 0, add the findings to the defect as “not yet resolved” to help developers focus on specific issues.

  • New findings queue
    • New findings represent new potential attack vectors from code which was added or changed in the most recent assessment. Appropriately handling the results in this group involves additional planning. IFA demonstrates the power of fix groups when a finding count reaches critical mass. Although the critical mass threshold will vary by assessment, usually twenty or more findings in a single assessment will benefit from fix grouping. This lends itself to some potential behavior changes based on the results at each stage, for example:
      • Baseline IFA assessment and new scan IFA assessment -> Delta New Analysis -> New findings assessment
      • If finding count > 20 then new finding assessment -> Fix Grouping
        Else new finding assessment -> new defect
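
The branching rule above is easy to automate in a pipeline script. The Python sketch below mirrors it literally; the function name and the returned labels are hypothetical, and the threshold of 20 is the rule of thumb from the text rather than a fixed IFA setting.

```python
def route_new_findings(finding_count, threshold=20):
    """Decide the next step for a new-findings assessment.

    More than `threshold` findings go to fix grouping; otherwise the
    assessment becomes a new defect directly.
    """
    if finding_count < 0:
        raise ValueError("finding_count cannot be negative")
    return "fix grouping" if finding_count > threshold else "new defect"


print(route_new_findings(35))  # fix grouping
print(route_new_findings(20))  # new defect
```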


IFA data retention and server security

Data retention

 

  • The IFA server retains the result of the jobs submitted for a period of 30 days by default. Modify retention time through the retention_days setting in the file <UNZIP DIRECTORY>\wlp\usr\servers\ifa\appscan\config\ifa-server.apsettings. See the Configuration section.


Secure Socket Layer (SSL) security certificates

 

 


Recent Updates

getAttribute Prefilter

IFA 1.1.30 includes an adjustment to the getAttribute prefilter. The prefilter has been adjusted to be less aggressive and retain more findings of value.


New prefilters

IFA 1.1 includes new prefilters across several languages for more robust and accurate categorization of scan findings and for better security and noise reduction in findings classification:

 

  • .NET:
    • IFA3022: A new .NET prefilter for System.Web.Mvc.Html.InputExtensions to write was added. Because getting the input items on a webpage is normal operation, these findings are categorized as uninteresting.

 

  • Java:
    • IFA2029: From a security perspective, java.net.InetAddress sources are generally uninteresting. A new prefilter was added to capture java.net.InetAddress sources under the uninteresting classification.
    • IFA2030: Commonly used java.sql.Statement.executeBatch operations are categorized as low-value findings.
    • IFA2031: A file source flowing to a file sink is uninteresting from a security perspective; it was added to the prefilter as part of the uninteresting finding category.
    • APS-356: If java.util.Locale.<init> appears in the taint list for a finding, the finding is effectively sanitized. It is difficult to get an attack through a locale retrieval, so such findings can be considered noise; this pattern was added to the prefilter list.


Issue fixes

Issue fixes in IFA 1.1 include the following:
 

  • APAR - CQPAR00224648: When running an IFA scan on an assessment an exception is thrown. 
  • APAR - CQPAR00218953: Visual Basic assessment throws an error while loading into AppScan Source for Analysis after running IFA on it.


Document Information

Modified date:
20 November 2018

UID

swg22004133