IBM Support

IBM i GSKit/SSL_ API GSK_WOULD_BLOCK/EWOULDBLOCK Advisory Condition After Applying 7.3 TR8 and with 7.4 OS

Troubleshooting


Problem

Adding support for TLSv1.3 altered the code path in System TLS, resulting in latency and response time improvements for some operations.  This has exposed secure non-blocking sockets applications that do not handle a would-block condition.  Depending on timing, some invocations of the gsk_secure_soc_read() or SSL_Read() API reach a would-block condition before the remote data has arrived from the wire.  In that case gsk_secure_soc_read() returns GSK_WOULD_BLOCK, and SSL_Read() returns SSL_ERROR_IO with errno set to EWOULDBLOCK.  These advisory return codes tell the application to retry the read operation.  However, some applications treat them as fatal.

Symptom

The gsk_secure_soc_read() or SSL_Read() API completes but returns no data.  gsk_secure_soc_read() returns the GSK_WOULD_BLOCK advisory return code.  SSL_Read() returns SSL_ERROR_IO with errno set to the advisory condition EWOULDBLOCK.  Both advise the client application to wait and then retry the read operation.

Cause

Upgraded to IBM i 7.4

OR
 
The following PTFs were installed for IBM i 7.3 OS
  • SF99730 level 20128 which includes Technology Refresh 8 (MF99208)
  • SF99867: 730 TCP/IP PTF Group Level: 4
  • SF99722: 730 IBM HTTP Server for i PTF Group Level: 24
    For GUI System Value QSSLPCL and QSSLCSL support, not for HTTP Server use of TLS 1.3
  • SF99725: 730 Java PTF Group Level: 17
    Plus these 4 Java PTFs:
    SI72654 and SI72653 - JVA-RUN JDK 80-64 Native JSSE TLSv1.3
    SI72652 and SI72651 - JVA-RUN JDK 70-64 Native JSSE TLSv1.2 ChaCha20Poly1305

Environment

IBM i 7.4
IBM i 7.3 TR8 and later
 

Diagnosing The Problem

The application reports a TLS read error during the read operation.  The gsk_secure_soc_read() API returns the GSK_WOULD_BLOCK advisory return code.  The SSL_Read() API returns an SSL_ERROR_IO return code with errno set to the advisory condition EWOULDBLOCK.

Resolving The Problem

***NOTE:  IBM does not consider this issue to be an IBM i OS defect.  The GSKit and SSL_ API changes expose programming issues in the client application that would also cause failures for any latency delay in the network.  The client application will need to be modified to handle the GSK_WOULD_BLOCK or EWOULDBLOCK advisory condition to resolve this issue.***
GSKit APIs
 
To resolve this issue, IBM advises that you either implement retry logic in your GSKit client application, or switch to blocking sockets and set the GSK_IBMI_READ_TIMEOUT attribute to the number of seconds you want a read to block before the secure connection times out. Set this attribute with the gsk_attribute_set_numeric_value() API after the gsk_environment_open() API is called and before the gsk_environment_init() API is called, along with any other GSKit attributes you set on the environment.

If you are not familiar with blocking and non-blocking mode, the following background applies.
 
Nonblocking I/O
The key change for application developers who choose blocking mode is to remove the non-blocking configuration:
Sockets are set to be non-blocking by either calling fcntl() to turn on the O_NONBLOCK flag, or calling ioctl() to turn on the FIONBIO flag. The default for a socket connection is blocking mode so if these calls are removed, the socket will go back to blocking mode.

https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/rzab6/cnonblock.htm
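Where the O_NONBLOCK flag is set by code that cannot easily be removed, the flag can also be cleared explicitly. The following is a minimal sketch using the standard fcntl() call. The helper names set_blocking and is_nonblocking are illustrative and are not part of any IBM API. An application that enabled non-blocking mode with ioctl() and FIONBIO would need the equivalent ioctl() call instead.

```c
#include <fcntl.h>

/* Return the descriptor to blocking mode by clearing O_NONBLOCK.
 * Returns 0 on success, -1 on failure (errno is set by fcntl). */
int set_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    if ((flags & O_NONBLOCK) == 0)
        return 0;                       /* already in blocking mode */
    return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1 ? -1 : 0;
}

/* Report whether O_NONBLOCK is set: 1 = yes, 0 = no, -1 = error. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```

Calling set_blocking() on the socket descriptor before it is associated with the secure session keeps the rest of the application unchanged.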
If you wish to continue to use non-blocking I/O, you will need to modify the client application to check for the GSK_WOULD_BLOCK advisory return code.  When it is returned, the client application needs to wait or sleep for a short time and then retry the gsk_secure_soc_read() API until all of the data arrives over the wire or the application stops trying and decides to fail.
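The retry mechanism for non-blocking reads might look like the following sketch. It is written against a function pointer so that it compiles without the GSKit header. On IBM i, the assumption is that gsk_secure_soc_read() is passed as read_fn (or wrapped in a thin function if its handle type differs) and GSK_WOULD_BLOCK as would_block_rc. The names read_with_retry, pause_ms, demo_read and DEMO_WOULD_BLOCK are hypothetical, and the pause uses POSIX nanosleep(), which is also an assumption; any short delay would serve.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <time.h>

/* Same shape as gsk_secure_soc_read(): handle, buffer, size, bytes read. */
typedef int (*secure_read_fn)(void *handle, char *buf, int size, int *amt_read);

/* Sleep for the given number of milliseconds between attempts. */
static void pause_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Call read_fn until it returns something other than would_block_rc, or
 * until max_attempts reads have been made (at least one is always made).
 * Returns the last return code, so the caller still sees would_block_rc
 * if the data never arrived and can then decide to fail. */
int read_with_retry(secure_read_fn read_fn, void *handle,
                    char *buf, int size, int *amt_read,
                    int would_block_rc, int max_attempts, long delay_ms)
{
    int rc = would_block_rc;
    if (max_attempts < 1)
        max_attempts = 1;
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        rc = read_fn(handle, buf, size, amt_read);
        if (rc != would_block_rc)
            break;                      /* data read, or a real error */
        if (attempt < max_attempts)
            pause_ms(delay_ms);         /* give the peer time to send */
    }
    return rc;
}

/* Hypothetical stand-in reader for trying the loop off IBM i: reports
 * would-block until the counter passed as the handle reaches zero. */
#define DEMO_WOULD_BLOCK 1   /* placeholder; use GSK_WOULD_BLOCK on IBM i */
int demo_read(void *handle, char *buf, int size, int *amt_read)
{
    int *countdown = handle;
    if (*countdown > 0) {
        (*countdown)--;
        return DEMO_WOULD_BLOCK;
    }
    if (size > 0)
        buf[0] = 'x';
    *amt_read = size > 0 ? 1 : 0;
    return 0;
}
```

Bounding the number of attempts keeps the application from spinning forever when the peer never sends; the attempt count and delay are tuning choices for the application.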
===================================================================
Typical GSKit API call flow:
 
An application that uses the sockets and GSKit APIs contains the following elements:
  1. A call to socket() to obtain a socket descriptor.
  2. A call to gsk_environment_open() to obtain a handle to a secure environment.
  3. One or more calls to gsk_attribute_set_xxxxx() to set attributes of the secure environment. At a minimum, either a call to gsk_attribute_set_buffer() to set the GSK_OS400_APPLICATION_ID value or to set the GSK_KEYRING_FILE value. Only one of these should be set. It is preferred that you use the GSK_OS400_APPLICATION_ID value. Also ensure that you set the type of application (client or server), GSK_SESSION_TYPE, using gsk_attribute_set_enum().
    ****This is where the call to gsk_attribute_set_numeric_value() is made to set the value for GSK_IBMI_READ_TIMEOUT. ****
  4. A call to gsk_environment_init() to initialize this environment for SSL/TLS processing and to establish the SSL/TLS security information for all secure sessions that run using this environment.
  5. Socket calls to activate a connection. It calls connect() to activate a connection for a client program, or it calls bind(), listen(), and accept() to enable a server to accept incoming connection requests.
  6. A call to gsk_secure_soc_open() to obtain a handle to a secure session.
  7. One or more calls to gsk_attribute_set_xxxxx() to set attributes of the secure session. At a minimum, a call to gsk_attribute_set_numeric_value() to associate a specific socket with this secure session.
  8. A call to gsk_secure_soc_init() to initiate the SSL/TLS handshake negotiation of the cryptographic parameters.
  9. Calls to gsk_secure_soc_read() and gsk_secure_soc_write() to receive and send data.
    ***This is where the client application would need to handle the [GSK_IBMI_ERROR_TIMED_OUT] return code if the secure connection read operation times out.***
  10. A call to gsk_secure_soc_close() to end the secure session.
  11. A call to gsk_environment_close() to close the secure environment.
  12. A call to close() to destroy the connected socket.
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/rzab6/cgskit.htm
===================================================================

When using blocking sockets, IBM advises you set the GSK_IBMI_READ_TIMEOUT attribute to the number of seconds you wish to block for before timing out the secure connection. The standard recommended value for this attribute is 5 seconds, but may have to be adjusted based on the average and peak CGI program execution times on the back-end. If your peak CGI request processing time is 7 seconds, you might want to set your timeout to be 10 seconds instead of 5 seconds.

When implementing blocking sockets with the GSK_IBMI_READ_TIMEOUT attribute, the client application must also be coded to handle the [GSK_IBMI_ERROR_TIMED_OUT] return code on the gsk_secure_soc_read() API call. If the read operation does time out, the program must handle the event. It can either retry the read operation or report an error once it has waited long enough for the peer to send something and it appears the peer never will.

Return Value
gsk_secure_soc_read() returns an integer. Possible values are:
[GSK_IBMI_ERROR_TIMED_OUT]
The value specified for the receive timeout expired before the read completed.
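
One way to decide between retrying and giving up after a timed-out read is to track a total wait budget. The sketch below assumes each timed-out read blocked for the full GSK_IBMI_READ_TIMEOUT value and stops retrying once another attempt would exceed the budget. It takes the read function as a pointer so that it compiles without the GSKit header. On IBM i the assumption is that gsk_secure_soc_read() is passed as read_fn and GSK_IBMI_ERROR_TIMED_OUT as timed_out_rc. The names read_within_budget, demo_slow_read and DEMO_TIMED_OUT are hypothetical.

```c
/* Same shape as gsk_secure_soc_read(): handle, buffer, size, bytes read. */
typedef int (*secure_read_fn)(void *handle, char *buf, int size, int *amt_read);

/* Retry a blocking read that timed out, as long as the total time spent
 * waiting stays within max_wait_s.  read_timeout_s is the value the
 * application set for the read timeout.  Returns the last return code. */
int read_within_budget(secure_read_fn read_fn, void *handle,
                       char *buf, int size, int *amt_read,
                       int timed_out_rc, int read_timeout_s, int max_wait_s)
{
    int waited_s = 0;
    if (read_timeout_s < 1)
        read_timeout_s = 1;             /* guard against a zero step */
    for (;;) {
        int rc = read_fn(handle, buf, size, amt_read);
        if (rc != timed_out_rc)
            return rc;                  /* data read, or a real error */
        waited_s += read_timeout_s;     /* this read used a full timeout */
        if (waited_s + read_timeout_s > max_wait_s)
            return rc;                  /* next attempt would exceed budget */
    }
}

/* Hypothetical stand-in reader for trying the logic off IBM i: times out
 * until the counter passed as the handle reaches zero. */
#define DEMO_TIMED_OUT 2   /* placeholder; use GSK_IBMI_ERROR_TIMED_OUT */
int demo_slow_read(void *handle, char *buf, int size, int *amt_read)
{
    int *timeouts_left = handle;
    (void)buf;
    (void)size;
    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return DEMO_TIMED_OUT;
    }
    *amt_read = 0;
    return 0;
}
```

With a 5-second read timeout and a 12-second budget, for example, this sketch makes at most two reads before returning the timed-out code to the caller.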
 


SSL_ APIs
IBM recommends that clients modify their application to check for the SSL_ERROR_IO return code with errno set to the advisory condition EWOULDBLOCK on the SSL_Read() API call. If SSL_Read() returns this advisory condition, wait or sleep for a short time and then retry the call to SSL_Read() to see whether the data has arrived and is now available to read.
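
A minimal sketch of that check-and-retry logic follows. It is written against a function pointer so that it compiles without the SSL_ header. On IBM i the assumption is that a thin wrapper around SSL_Read() is passed as read_fn and SSL_ERROR_IO as io_error_rc. The names ssl_read_with_retry, pause_ms, demo_ssl_read and DEMO_SSL_ERROR_IO are hypothetical, and the pause uses POSIX nanosleep(), which is also an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <errno.h>
#include <time.h>

/* Wrapper shape for SSL_Read(): returns bytes read or a negative error. */
typedef int (*ssl_read_fn)(void *handle, void *buf, int len);

/* Sleep for the given number of milliseconds between attempts. */
static void pause_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Retry only when read_fn returns io_error_rc AND errno is EWOULDBLOCK.
 * Any other result (data or a real error) is returned immediately.
 * No sleep follows the final attempt, so errno is still EWOULDBLOCK
 * when the retries are exhausted. */
int ssl_read_with_retry(ssl_read_fn read_fn, void *handle, void *buf, int len,
                        int io_error_rc, int max_attempts, long delay_ms)
{
    int rc = io_error_rc;
    if (max_attempts < 1)
        max_attempts = 1;
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        errno = 0;
        rc = read_fn(handle, buf, len);
        if (!(rc == io_error_rc && errno == EWOULDBLOCK))
            break;                      /* not the advisory condition */
        if (attempt < max_attempts)
            pause_ms(delay_ms);         /* wait briefly, then retry */
    }
    return rc;
}

/* Hypothetical stand-in reader for trying the loop off IBM i. */
#define DEMO_SSL_ERROR_IO (-1)   /* placeholder; use SSL_ERROR_IO on IBM i */
int demo_ssl_read(void *handle, void *buf, int len)
{
    int *countdown = handle;
    (void)buf;
    if (*countdown > 0) {
        (*countdown)--;
        errno = EWOULDBLOCK;
        return DEMO_SSL_ERROR_IO;
    }
    return len > 0 ? 1 : 0;             /* pretend one byte was read */
}
```

Checking errno together with the return code matters here, because SSL_ERROR_IO with any other errno value is a genuine I/O failure and should not be retried.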

Document Location

Worldwide


Document Information

Modified date:
30 June 2020

UID

ibm16237392