Configuring WML for z/OS base for high performance

WML for z/OS uses z/OS Spark. If it is not properly configured, Spark can consume a large amount of the available resources on your z/OS system. It's recommended that you evaluate the performance of WML for z/OS and fine-tune your system settings accordingly.

Before you begin

Important: The performance data and recommendations contained in this document were determined in a controlled environment. Therefore, your results might vary significantly. No commitment as to your ability to obtain equivalent results is in any way intended or made by the information in this document.

Procedure

To improve the WML for z/OS performance, apply the following settings, configurations, and practices:

  • In the SPARK_CONF_DIR/spark-env.sh file, specify a limit on the number of cores that each application can use.
    For example, the following environment variable limits each application to two cores:
    SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=2"
  • In the log4j.properties file in the $IML_INSTALL_DIR/configuration/generated/ directory, set the logging level for the Liberty Profile server to WARN or lower to reduce the impact of logging on performance.
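    For example, assuming the generated log4j.properties uses the standard log4j root-logger syntax, a setting like the following limits output to warnings and errors (the appender name here is illustrative; keep the appender references already present in your generated file):

    ```properties
    # Log only WARN and above to reduce logging overhead
    log4j.rootLogger=WARN, CONSOLE
    ```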
  • When you create models, use the following approaches to tune application performance:
    • In JDBC calls, return only the data that you need. That is, if you need only one percent of the data, return only one percent. In the following example, the table contains six million rows, but only one percent of the data is used for training, so the query limits the result to the first 60,000 rows:
      val jdbcDF2 = sparkSession.read.format("jdbc").options(Map(
           "driver" -> "com.ibm.db2.jcc.DB2Driver",
           "url" -> "jdbc:db2://host-name",
           "user" -> "userID", "password" -> "password",
           "dbtable" -> ("(SELECT * FROM qualifier.table-name " +
                           "FETCH FIRST 60000 ROWS ONLY) as t")
           )).load()
    • When training the model, cache the DataFrame to reduce the number of calls to Db2®:
      val trainCached = trainDF.cache()
  • A good rule of thumb is to size your executor heaps so that 100 percent of the data can be cached in memory, which avoids spillover to disk.
    When you load data from Db2 and store it in DataFrames, the in-memory size is 1 - 2 times the flat data size. To determine the final executor heap size, use the following calculation:
    in-memory data size / percentage of heap reserved for cache = executor heap size
    The default percentage of heap reserved for cache is 0.54. For example, if you have 1.5 GB of flat data to load, you can use the following calculation to find the final executor heap size:
    3 GB (1.5 GB * 2) / 0.54 (default) = 5.56 GB
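    The heap sizing above can be sketched as a small helper (Python here for illustration; the 2x expansion factor and 0.54 cache fraction are the rule-of-thumb values from this procedure, not values read from your Spark configuration):

    ```python
    def executor_heap_gb(flat_data_gb: float,
                         expansion_factor: float = 2.0,
                         cache_fraction: float = 0.54) -> float:
        """Estimate the executor heap needed to cache a Db2 load fully.

        flat_data_gb:     size of the flat data pulled from Db2
        expansion_factor: in-memory DataFrame size is 1 - 2 times the
                          flat size; 2.0 is the conservative end
        cache_fraction:   share of the heap reserved for cache (0.54 default)
        """
        in_memory_gb = flat_data_gb * expansion_factor
        return in_memory_gb / cache_fraction

    # 1.5 GB of flat data -> 3 GB in memory -> roughly 5.6 GB of heap
    print(f"{executor_heap_gb(1.5):.2f} GB")
    ```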