Troubleshooting IBM MFA

The troubleshooting steps you perform depend on which system has caused the error.

General authentication failure troubleshooting tips

If you are unable to successfully authenticate with one or more authentication factors, start your troubleshooting with these general-purpose tips:
  1. For in-band authentication, turn off compound authentication for the authentication factor. For IBM® MFA Out-of-Band authentication, set only one authentication factor to be active at a time. This step simplifies the authentication flow and reduces the possible points of failure.
  2. If all authentication factors fail, ensure that both of the IBM MFA started tasks are started.
    Important: Start the IBM MFA started tasks after TCP/IP, PAGENT (for AT-TLS, if needed), and ICSF (if needed) have started successfully and all TCP/IP-related services such as the resolver are running and fully initialized. See IBM MFA configuration roadmap for the factor-specific configuration requirements.
    • To verify that the IBM MFA services started task started, check the SYSLOG for errors. The absence of errors after the "AZF2110I Started console receiver" message in the SYSLOG indicates success.
    • To verify that the IBM MFA web services started task started, check the SYSLOG for errors. Messages such as the following indicate a successful startup:
      20190822122051.560549 AZFWEB:AZF6002I Server base init success (sts=0, rc=0, rsn=0x0)
      20190822122051.560755 AZFWEB:AZF6050I Console listener task starting up
      20190822122052.563303 AZFWEB:AZF6012I IBM Multi-Factor Authentication Web Services startup complete
  3. The AZF#IN00 started task can fail to start with a return code of 8 or 16. A return code of 8 indicates that AZFSTCMN is not running in Key 2, as described in Update SCHEDxx PARMLIB program properties. A return code of 16 means that there is another instance of AZFSTCMN running on this LPAR and the program call linkage cannot be created. If this is not the case, take a full system dump and submit it to IBM.
  4. Check the authentication factor ISPF panels for typos or missing fields.
  5. Ensure that the PKCS#11 token name specified for the authentication factor exists and is correct.
  6. Check the SYSLOG to verify that the authentication factors you configured started without errors. It is expected that any authentication factors that you did not configure will show notifications in the SYSLOG.
    Consider the following sample successful SYSLOG entries for AZFSIDP1 and AZFTOTP1:
    20190828151037.196497 PLUGHOST:AZF2102I Loaded authenticator (name: AZFSIDP1, entry point: 0x137C9098, status: 0x0)
    20190828151037.196977 PLUGS:AZF2108I Authenticator entry point invoked: status =0x0
    20190828151037.197751 PLUGHOST:Successfully retrieved system factor data:                                                           
    20190828151037.199566 AZFSIDP:AZF3021I: AZFSIDP1 Initializing....
    20190828151037.203474 AZFSIDP:Incoming settings blob length: 213                                                                    
    20190828151037.204742 AZFSIDP:AZF3054I: AZFSIDP1 Settings follow:                                                                   
    20190828151037.204773 AZFSIDP: Authenticator settings:                                                                              
    20190828151037.204791 AZFSIDP:  Initial trace level: 1                                                                              
    20190828151037.204807 AZFSIDP:  Compound mode:       Enabled                                                                        
    20190828151037.204824 AZFSIDP:  Compound separator:  *  
    20190828151037.204841 AZFSIDP:  Compound order:      Password first                                  
    20190828151037.204859 AZFSIDP:  SDCONF path:         PATH.AZF.SDCONF.REC                          
    20190828151037.204876 AZFSIDP:  Node Secret path:    PATH.AZF.NODESCRT                            
    20190828151037.204893 AZFSIDP:  SDOPTS path:         PATH.AZF.SDOPTS.REC                          
    20190828151037.227594 PLUGS:AZF2109I Authenticator initialized : entry 0x137C9098, name AZFSIDP1 (strong)
    :
    20190828151037.229049 PLUGHOST:about to load AZFTOTP1                                                                              
    20190828151037.229765 PLUGHOST:AZF2102I Loaded authenticator (name: AZFTOTP1, entry point: 0x138C1870, status: 0x0)
    20190828151037.233076 PLUGS:AZF2108I Authenticator entry point invoked : status = 0x0
    20190828151037.233189 PLUGHOST:Successfully retrieved system factor data:                                                          
    20190828151037.235061 AZFTOTP:AZF4126I AZFTOTP1 settings follow:                                                                   
    20190828151037.235108 AZFTOTP: Authenticator settings:                                                                             
    20190828151037.235131 AZFTOTP:  Initial trace level: 3                                                                             
    20190828151037.235153 AZFTOTP:  Compound mode:       Enabled                                                                       
    20190828151037.235172 AZFTOTP:  Compound order:      Password First                                                                
    20190828151037.235192 AZFTOTP:  Compound separator:  :                                                                             
    20190828151037.235212 AZFTOTP:  Default ALG:         SHA512                                                                        
    20190828151037.235231 AZFTOTP:  Default NUMDIGITS:   8                                                                             
    20190828151037.235247 AZFTOTP:  Default PERIOD:      30                                                                            
    20190828151037.235264 AZFTOTP:  Default WINDOW:      10                                                                            
    20190828151037.235281 AZFTOTP: Registration services settings:                                                                     
    20190828151037.235303 AZFTOTP:  Initial trace level: 3                                                                             
    20190828151037.235328 AZFTOTP:  Realm name:          RS13TOTP                                                                      
    20190828151037.236509 AZFTOTP:AZF4001I AZFTOTP1 Authenticator init                                                                 
    20190828151037.236574 PLUGS:AZF2109I Authenticator initialized : entry 0x138C1870, name AZFTOTP1 (strong)
    20190828151037.236971 PLUGHOST:about to load AZFPTKT1                                                                              
    
  7. Check the SYSLOG for obvious authentication errors. In the following example, the user was denied access by the AZFLDAP1 authentication factor, possibly because of an incorrect LDAP password:
    20190829112522.782214 STCMAIN:AZF2227I User USERB denied access in-band by factor AZFLDAP1
  8. Turn on a higher level of component tracing, as described in Modifying component trace levels. You can turn on tracing on a per-component basis, and independently for each of the started tasks. Lower the trace level to 0 or 1 after the problem has been reproduced and the data has been collected.
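
The SYSLOG checks in steps 2, 6, and 7 can be partly automated once the relevant entries have been copied to a workstation. The following Python sketch is illustrative only. It assumes that each entry sits on a single line in the "timestamp COMPONENT:message" layout shown in the samples above, and that a message ID ending in E marks an error (the usual IBM convention, not something this page states). The names parse_entries, error_entries, and denied_logins are hypothetical helpers, not part of IBM MFA.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Groups: timestamp, component, optional AZF message ID, remaining text.
LINE_RE = re.compile(
    r"^\s*(\d{14}\.\d{6})\s+([A-Z0-9#]+):\s*(?:(AZF\d{4}[A-Z]):?\s*)?(.*?)\s*$"
)
# Groups: user ID, in-band or out-of-band mode, factor name.
DENIED_RE = re.compile(r"User (\S+) denied access (\S+) by factor (\S+)")


class Entry(NamedTuple):
    timestamp: str
    component: str
    msgid: Optional[str]
    text: str


def parse_entries(lines: List[str]) -> List[Entry]:
    """Parse lines that match the sample layout; skip anything else."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            entries.append(Entry(*match.groups()))
    return entries


def error_entries(entries: List[Entry]) -> List[Entry]:
    """Assumption: a message ID that ends in 'E' is an error message."""
    return [e for e in entries if e.msgid and e.msgid.endswith("E")]


def denied_logins(entries: List[Entry]) -> List[Tuple[str, str, str]]:
    """Return (user, mode, factor) for each AZF2227I denial."""
    results = []
    for entry in entries:
        if entry.msgid == "AZF2227I":
            match = DENIED_RE.search(entry.text)
            if match:
                results.append(match.groups())
    return results


# Entries taken from the samples on this page, joined onto single lines.
SAMPLE_SYSLOG = [
    "20190828151037.199566 AZFSIDP:AZF3021I: AZFSIDP1 Initializing....",
    "20190829112522.782214 STCMAIN:AZF2227I User USERB denied access in-band by factor AZFLDAP1",
    "20190829162219.118889 RADPBASE:AZF9215E Failed to resolve hostname entry: serVver.company.com",
]
```

Calling denied_logins(parse_entries(SAMPLE_SYSLOG)) returns the USERB denial by AZFLDAP1, which is the same conclusion that step 7 draws by reading the log manually.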

Troubleshooting RSA SecurID and RADIUS

If the entry in the SYSLOG indicates that an authentication is denied by RSA SecurID or any of the RADIUS authentication factors, start your troubleshooting with the following steps:
  1. If you changed the PKCS#11 token name or label for any of the RADIUS factors, ensure that you also re-entered the existing shared secret on the ISPF panel.
  2. Ensure that there is connectivity between the RSA SecurID or RADIUS server and the IBM MFA system. You should be able to ping the RSA SecurID or RADIUS server from the IBM MFA system.
  3. Check the SYSLOG for connection errors to the RSA SecurID or RADIUS server. In the following example, there was a typo in the RADIUS server name and the hostname cannot be resolved:
    20190829162219.118889 RADPBASE:AZF9215E Failed to resolve hostname entry: serVver.company.com
    20190829162219.118907 RADPBASE:AZF9207E Failed to init RADIUS server entry (primary, sts=68306)
    20190829162219.118923 RADPBASE:AZF9207E Failed to init RADIUS server entry (no valid servers specified)
    20190829162219.118941 AZFSFNP:AZF9130E RADIUS initialization failed (sts=68321, p11rc=0, p11rsn=0x0)
  4. Verify that the RSA SecurID or RADIUS server accepts communications from each z/OS® system or LPAR that is running the IBM MFA services started task.
  5. Check the RSA SecurID or RADIUS server authentication log to see if the authentication was successful or why it was denied.
  6. Check the status of the RSA SecurID or RADIUS token or the user PIN for an account that is generating an error. It is possible that a token is inactive, that a user PIN has expired, and so forth.
  7. If you made configuration changes to the RSA SecurID AZFSIDP1 authentication factor and authentications no longer succeed, clear the node secret from each IBM MFA client host and retry.
  8. RSA SecurID disaster recovery steps are described in Disaster recovery for IBM MFA with SecurID.
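
Steps 2 and 3 can be combined into a quick check: pull the host names that IBM MFA failed to resolve out of the SYSLOG, then try to resolve them yourself. The following Python sketch is a rough aid under stated assumptions. It assumes each AZF9215E entry is on a single line with the wording shown in the sample above, and the function names unresolved_hosts and can_resolve are hypothetical. Name resolution uses the resolver of the machine that runs the script, so a result obtained on a workstation does not prove what the z/OS resolver would return.

```python
import re
import socket
from typing import List

# Wording taken from the AZF9215E sample entry on this page.
UNRESOLVED_RE = re.compile(r"AZF9215E Failed to resolve hostname entry:\s*(\S+)")


def unresolved_hosts(lines: List[str]) -> List[str]:
    """Return each host name reported by AZF9215E, in order, without duplicates."""
    hosts = []
    for line in lines:
        match = UNRESOLVED_RE.search(line)
        if match and match.group(1) not in hosts:
            hosts.append(match.group(1))
    return hosts


def can_resolve(host: str) -> bool:
    """Return True if the local resolver can translate the name to an address."""
    try:
        socket.getaddrinfo(host, None)
    except (OSError, UnicodeError):
        return False
    return True


SAMPLE_SYSLOG = [
    "20190829162219.118889 RADPBASE:AZF9215E Failed to resolve hostname entry: serVver.company.com",
    "20190829162219.118907 RADPBASE:AZF9207E Failed to init RADIUS server entry (primary, sts=68306)",
]
```

A host name that appears in unresolved_hosts(SAMPLE_SYSLOG) and that also fails can_resolve is a good candidate for a typo in the RADIUS server field, as in the serVver.company.com example.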

Troubleshooting IBM MFA Certificate Authentication

If the user receives a "There was an error connecting to the server." error when attempting to log in with Certificate Authentication, ensure that Enable out of band services and Enable certificate authentication are both enabled, as described in Configure IBM MFA web services started task.

Troubleshooting TOTP and generic TOTP

Begin your TOTP troubleshooting with the following steps:
  1. For TOTP, if your web services server certificate was not issued by a well-known CA, do not instruct users to visit the web services server start page until they have a Configuration Profile installed that allows them to establish TLS connections with the web services server. If users accept the web services server certificate in Mobile Safari as an SSL exception, the IBM TouchToken for iOS application still cannot trust the CA that issued the certificate. Users will be able to view the enrollment launch URL, but will not be able to complete enrollment.
    Note: The iOS operating system has certificate requirements that are not always satisfied by self-signed certificates. If you are attempting to use a self-signed certificate with the IBM TouchToken for iOS application and cannot successfully authenticate, it may be because iOS does not accept the certificate. This is true even if you successfully create a Configuration Profile for the self-signed certificate.
  2. If the user receives a "There was an error connecting to the server." error when attempting to log in, ensure that Enable out of band services and Enable TOTP services are both enabled, as described in Configure IBM MFA web services started task.
  3. If the user is unable to enroll their device for generic TOTP, the most likely cause is that the user forgot to enter the displayed TOTP code on the web page and click Generic TOTP Enrollment.
    Ensure that the user performs these steps:
    1. Instruct the user to open the generic TOTP start page in a desktop web browser and log in with their z/OS user name and password: https://hostname:6789/AZFTOTP1/genericStart

      A page that contains the AuthURL and the encoded QR code is displayed.

    2. Instruct the user to point their device at the generated QR code and scan it with the application. The application displays the TOTP code.
    3. Instruct the user to enter this TOTP code on the web page and click Generic TOTP Enrollment. The user may have to scroll to see this control, depending on the size of their browser window.
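
The code that the application displays during enrollment is derived from the shared secret in the QR code and the current time. If you need to confirm what a compliant application should display, the following Python sketch computes a code with the standard TOTP algorithm from RFC 6238. It is not IBM MFA's own implementation. The function name totp is hypothetical, the defaults mirror the AZFTOTP1 defaults in the sample SYSLOG entries earlier in this topic (SHA512, 8 digits, 30-second period), and the secret is assumed to be available as raw bytes. The secrets used here are the published RFC 6238 test secrets, not real enrollment data.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, period: int = 30,
         digits: int = 8, algorithm: str = "sha512") -> str:
    """Compute an RFC 6238 TOTP code for the given time."""
    # Number of whole periods since the Unix epoch, as a big-endian 64-bit counter.
    counter = struct.pack(">Q", unix_time // period)
    digest = hmac.new(secret, counter, getattr(hashlib, algorithm)).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 Appendix B test secrets (ASCII digits repeated to the hash length).
RFC_SECRET_SHA1 = b"12345678901234567890"
RFC_SECRET_SHA512 = (b"1234567890" * 7)[:64]
```

Because the code depends on the current time, a device clock that differs noticeably from the clock on the z/OS system can produce codes that do not match, which is worth ruling out before repeating the enrollment steps.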

Troubleshooting AT-TLS

Be aware that the Policy Agent (PAGENT) task often writes task-initialization messages (including success or failure when interpreting and loading TTLSRule definitions) to a different log stream than the one used for TCP socket behavior messages. Socket behavior messages include confirmation that rules are firing when expected, and details of the TLS negotiation process for specific peer connections.

The default location for PAGENT task-level log messages is /tmp/pagent.log. You can change this location with the -L parameter passed to the PAGENT program in the PAGENT job JCL. Some installations route these messages to the z/OS UNIX syslog.

Socket behavior log messages are always written to the z/OS UNIX syslog. If syslogd is not running, or is not configured to route TCP/IP daemon messages to an alternative file location, they go to the operator log. If you do not see AT-TLS messages in the operator log, inspect the syslogd configuration (usually /etc/syslog.conf) and look for a directive that routes TCP/IP daemon messages to a specific file. These logs often contain enough data to diagnose TLS negotiation errors, or at least to point to the next step, without resorting to a packet trace. IBM recommends using syslogd to collect these messages in a separate z/OS UNIX file because this makes them easier to read and to provide to support.
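
To find where syslogd sends these messages, you can list the routing rules in a copy of the syslogd configuration. The following Python sketch is a rough aid only. It assumes the common syslog.conf layout of a selector followed by whitespace and a destination, with # starting a comment. The selector syntax used on your system may differ, and the sample configuration, the /var/log/attls.log path, and the function names are all hypothetical.

```python
from typing import List, Tuple


def routing_rules(conf_text: str) -> List[Tuple[str, str]]:
    """Return (selector, destination) pairs, skipping blank lines and comments."""
    rules = []
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split at the first run of whitespace
        if len(parts) == 2:
            rules.append((parts[0], parts[1].strip()))
    return rules


def rules_mentioning(rules: List[Tuple[str, str]], needle: str) -> List[Tuple[str, str]]:
    """Return the rules whose selector contains the given text, ignoring case."""
    lowered = needle.lower()
    return [rule for rule in rules if lowered in rule[0].lower()]


# Hypothetical configuration, for illustration only.
SAMPLE_CONF = """
# route daemon messages to their own file
daemon.*    /var/log/attls.log
*.err       /var/log/errors.log
"""
```

If rules_mentioning(routing_rules(SAMPLE_CONF), "daemon") returns a rule, its destination is the file to inspect for socket behavior messages. An empty result suggests that the messages are going to the operator log instead.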

The user receives an "Error processing MFA request" error

There are several possible causes of this error:
  • The authentication methods configured for the user must match the policy. The policy is not satisfiable if the user is not configured for all of the authentication methods required by the policy.
  • The IBM MFA configuration must not contain leading or trailing spaces. For example, if an extraneous space exists in the Radius Primary Server field, IBM MFA will not be able to resolve the host name or IP address.
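
Both causes can be checked mechanically if you first record the relevant values by hand. The following Python sketch is illustrative only. It assumes you have noted the factors required by the policy, the factors configured for the user, and the configuration field values yourself; it does not read anything from IBM MFA. The function names and the "PKCS#11 token name" value are hypothetical.

```python
from typing import Dict, List, Set


def missing_factors(policy_factors: Set[str], user_factors: Set[str]) -> List[str]:
    """Return the factors that the policy requires but the user is not configured for."""
    return sorted(policy_factors - user_factors)


def padded_fields(settings: Dict[str, str]) -> List[str]:
    """Return the names of fields whose values have leading or trailing whitespace."""
    return sorted(
        name for name, value in settings.items() if value != value.strip()
    )


# Values noted by hand; the trailing space in the server name is deliberate.
SAMPLE_SETTINGS = {
    "Radius Primary Server": "server.company.com ",
    "PKCS#11 token name": "AZFTOKEN",
}
```

For example, missing_factors({"AZFSIDP1", "AZFTOTP1"}, {"AZFTOTP1"}) reports AZFSIDP1, which means the policy is not satisfiable for that user, and padded_fields(SAMPLE_SETTINGS) reports the Radius Primary Server field.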

Browser shows incorrect or stale data

If your web browser shows incorrect or stale data, refresh the browser window. The browser cache might be out of sync with the IBM MFA server.

Helpful information to provide when requesting support

When requesting support or opening a PMR for IBM MFA, it is most helpful if you:
  • Provide a detailed problem description.
  • Turn on the highest level of component tracing, as described in Modifying component trace levels. You can turn on tracing on a per-component basis, and independently for each of the started tasks. Lower the trace level to 0 or 1 after the problem has been reproduced and the data has been collected.
  • Provide the SYSLOG for both started tasks.
  • Indicate which external security manager (for example, RACF®) is being used.
  • Provide the system dump (SVC dump), if one was generated by a failure.
  • Provide details on the type and version of the external authentication server, if there is a communication or authentication issue involving it.
  • Indicate which client browser and OS are being used, and their levels, if the problem involves IBM MFA Out-of-Band authentication or registration.
  • Provide the browser HTTP ARchive format (HAR) file data for the failing scenario if the problem involves IBM MFA Out-of-Band authentication or registration and can be reproduced. The steps to produce a HAR file are browser specific.
  • For RADIUS authentication, provide the RADIUS server authentication log, if available.
  • For IBM Verify Gateway for RADIUS, also provide both the Windows Event log and the trace file specified in the IbmRadiusConfig.json file.