Installing and configuring Apache Kafka on z/OS

Install and configure a stand-alone Apache Kafka server on z/OS® to receive data.

Before you begin

To run Apache Kafka on z/OS, your z/OS operating system must meet the following requirements:

z/OS version 2.4 or later. With z/OS 2.4, the following fixes must be installed:
- OA60306/UJ90013
- OA60310/UJ05191
- OA60316/UJ05214
- PH32235/UI74844
IBM® 64-bit SDK for z/OS Java™ Technology Edition V8 (product number 5655-DGH). It must be at the minimum service level of SR6 FP20 (8.0.6.20). It is recommended to use the latest service release. To find the latest service release or fix pack, see IBM SDK for z/OS, Java Technology Edition.
Bash 4.3 or later. You can download Bash from the Rocket Software website.
Apache Kafka version 2.6.0 or later. You can download the Kafka binaries from the Apache Kafka download page.
Note: Kafka 3.0.0 and later versions have a known issue on z/OS. Do not run Kafka 3.0.0 or later on z/OS.
A dedicated zFS file system is recommended. The zFS data set needs to be in extended format so that it can be allocated or grow beyond four gigabytes (4 GB) in size.
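The Bash requirement in this list can be checked from a z/OS UNIX System Services shell. The following is a minimal sketch, not part of the product: the helper name version_ge is hypothetical, and it compares only the major and minor numbers of a dotted version string.

```shell
# Hypothetical helper: succeed when dotted version $1 is at least $2.
# Only the major and minor numbers are compared.
version_ge() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_rest=${2#*.}
  want_minor=${want_rest%%.*}
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
}

# BASH_VERSION looks like "4.3.48(1)-release" and is empty outside Bash.
if [ -n "${BASH_VERSION:-}" ] && version_ge "$BASH_VERSION" 4.3; then
  echo "Bash $BASH_VERSION meets the 4.3 minimum"
else
  echo "Bash 4.3 or later is required"
fi
```

Run the snippet with the Bash binary that you plan to reference in the started task JCL, so that the version reported is the one that Kafka will actually use.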

Procedure

In the following example, Apache Kafka is installed into a dedicated zFS data set, which is mounted at the /u/kafka z/OS UNIX System Services directory.

Update the following parameters in PARMLIB member BPXPRMxx.

Some of the parameters in PARMLIB member BPXPRMxx might prevent Java from operating successfully by limiting the resources that it requires. See the suggested minimum values for these parameters in the IBM Java on z/OS documentation: BPXPRM settings (z/OS only).

In addition, Apache Kafka requires larger values than recommended by the IBM Java on z/OS documentation for the following parameters.

MAXMMAPAREA
MAXSHAREPAGES
SHRLIBRGNSIZE

The following sample lists the recommended BPXPRMxx settings for Java and Kafka, including these parameters. It is not a complete BPXPRMxx member; you need to merge it into your own BPXPRMxx member.

/*********************************************************************/
/*   BPXPRM settings for Java/Kafka                                  */
/*********************************************************************/
MAXPROCSYS(900)                         /* default 900               */
MAXPROCUSER(512)                        /* default 25                */
MAXUIDS(500)                            /* default 200               */
MAXTHREADS(10000)                       /* default 200               */
MAXTHREADTASKS(5000)                    /* default 1000              */
MAXASSIZE(2147483647)                   /* default 209715200         */
MAXCPUTIME(2147483647)                  /* default 1000              */
MAXMMAPAREA(2097152)                    /* default 40960             */
MAXSHAREPAGES(2097152)                  /* default 131072            */
IPCSEMNIDS(500)                         /* default 500               */
IPCSEMNSEMS(1000)                       /* default 1000              */
SHRLIBRGNSIZE(134217728)                /* default 67108864          */
SHRLIBMAXPAGES(4096)                    /* default 4096              */

After you update the PARMLIB member BPXPRMxx, either perform an IPL or run the SET OMVS=xx command to apply the new values dynamically.

Download the Apache Kafka binary 2.6.0 or later from the Apache Kafka download page. In this example, the Kafka binary file kafka_2.13-2.6.0.tgz is used.

Regarding the name of the Apache Kafka binary, the first version number (2.13) is the Scala version that the Kafka package is built for. Always get a kafka_2.13 release. The second version number (2.6.0) is the Apache Kafka version itself.
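The two version numbers in the file name can be taken apart with shell parameter expansion. This is a minimal sketch, assuming the kafka_<scala>-<kafka>.tgz naming used in this example; the variable names are illustrative only.

```shell
pkg=kafka_2.13-2.6.0.tgz

base=${pkg%.tgz}               # kafka_2.13-2.6.0
versions=${base#kafka_}        # 2.13-2.6.0
scala_version=${versions%%-*}  # 2.13  (build number before the hyphen)
kafka_version=${versions#*-}   # 2.6.0 (Kafka release after the hyphen)

echo "Scala $scala_version, Kafka $kafka_version"

# Flag releases outside the 2.6.0 to 2.x range that this procedure covers.
case $kafka_version in
  2.[6-9].*) echo "Kafka $kafka_version is in the range described here" ;;
  *)         echo "Kafka $kafka_version is outside the range described here" ;;
esac
```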
Use FTP or another method to transfer the Apache Kafka binary file to the z/OS UNIX System Services directory /u/kafka. Be sure to use binary mode when you transfer the file.
Extract the Apache Kafka binary package.
- If gzip is available on your z/OS system, run the gzip command and the tar command.
  Tip: You can download gzip from the Rocket Software website.
```
cd /u/kafka
gzip -d kafka_2.13-2.6.0.tgz
tar -C /u/kafka -xovf kafka_2.13-2.6.0.tar
```
  These commands extract the Apache Kafka binary to the /u/kafka/kafka_2.13-2.6.0 directory.
- If gzip is not available on your z/OS system, first decompress the Apache Kafka binary .tgz file to a .tar file on your distributed system. Then transfer the .tar file in binary mode to the z/OS UNIX System Services directory and extract it with the tar command.
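For the case where gzip is not available on z/OS, the decompression on the distributed system might look like the following sketch. The helper name tgz_to_tar is hypothetical; it relies only on the standard gzip -d (decompress) and -c (write to standard output) options and leaves the original .tgz file in place.

```shell
# Hypothetical helper for the distributed system: write NAME.tar next to
# NAME.tgz by decompressing it, without deleting the original archive.
tgz_to_tar() {
  gzip -dc "$1" > "${1%.tgz}.tar"
}

# Example call (commented out because the file must exist first):
# tgz_to_tar kafka_2.13-2.6.0.tgz    # writes kafka_2.13-2.6.0.tar
```

After the .tar file is transferred to /u/kafka, the same tar -C /u/kafka -xovf command that is used in the gzip case extracts it.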
Tag the Apache Kafka shell scripts and properties files.
In the Apache Kafka binary, all the shell scripts and properties files are in ASCII encoding. By default, z/OS cannot process these files because z/OS UNIX System Services expects EBCDIC. The z/OS enhanced ASCII support provides automatic conversion between ASCII and EBCDIC encoding. To use the automatic conversion, a file must be correctly tagged with its code set.

Use the following commands to tag the Kafka shell scripts and properties files as ASCII files.
```
cd /u/kafka/kafka_2.13-2.6.0/bin
chtag -tc ISO8859-1 *.sh

cd /u/kafka/kafka_2.13-2.6.0/config
chtag -tc ISO8859-1 *.properties
```
Important: In an OMVS, telnet, or SSH session on z/OS, the shell scripts or properties files might not display correctly in vi. Automatic conversion does not work if AUTOCVT is not enabled system-wide and the _BPXK_AUTOCVT environment variable is not set. If you encounter this problem, run the following command to turn on automatic conversion for the current session:
```
export _BPXK_AUTOCVT=ON
```
Update the ZooKeeper properties file.
ZooKeeper is a distributed, open source coordination service that maintains naming and configuration data and provides flexible and robust synchronization within distributed systems. ZooKeeper keeps track of the status of the Kafka cluster nodes, Kafka topics, partitions, and so on. For more information about ZooKeeper, see the Apache ZooKeeper website.
To run ZooKeeper on z/OS, you need to update the default ZooKeeper properties file zookeeper.properties, which is located in the /u/kafka/kafka_2.13-2.6.0/config directory. Make at least the following update:
```
dataDir=/u/kafka/kafka_2.13-2.6.0/logs/zookeeper-logs
```
Update the Kafka server properties file.
To run the Kafka server on z/OS, you need to update the default Kafka server properties file server.properties, which is located in the /u/kafka/kafka_2.13-2.6.0/config directory. Make at least the following updates:
```
listeners=PLAINTEXT://yourhostname:9092
log.dirs=/u/kafka/kafka_2.13-2.6.0/logs/kafka-logs
```
Replace yourhostname with the fully qualified host name of the z/OS system where the Kafka server runs.
Optionally, you can add or update the following parameters in the server.properties file to save DASD space. In the following example, the Kafka server keeps only the last 2 hours of data, which helps to minimize DASD usage during testing.
```
# Default: 86400000
log.cleaner.delete.retention.ms=60000
# Default: 168
log.retention.hours=2
# Default: -1
log.retention.bytes=2147483648
# Default: 300000
log.retention.check.interval.ms=60000
```
Before you put the Kafka server into production, evaluate these values against your own requirements.
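To gauge how much DASD space such retention settings can consume, the arithmetic can be sketched in the shell. This sketch assumes that log.retention.bytes limits each partition separately, as described in the Apache Kafka broker configuration documentation. The partition count of 12 is a made-up figure for illustration, and actual usage can differ somewhat because Kafka removes data one log segment at a time.

```shell
retention_bytes=2147483648   # log.retention.bytes from the example
retention_hours=2            # log.retention.hours from the example
partitions=12                # assumption: partitions hosted on this broker

# Size-based bound: every partition may hold up to retention_bytes.
bound_bytes=$((retention_bytes * partitions))
bound_gib=$((bound_bytes / 1024 / 1024 / 1024))

# Time-based bound expressed in milliseconds, the unit of the *.ms settings.
retention_ms=$((retention_hours * 60 * 60 * 1000))

echo "Size-based retention bound: ${bound_gib} GiB for ${partitions} partitions"
echo "Time-based retention: ${retention_ms} ms"
```

Comparing the size-based bound with the capacity of the zFS data set that holds log.dirs gives a first indication of whether the file system is large enough.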

Create SYS1.PROCLIB members for starting and stopping ZooKeeper and Kafka server.

Although ZooKeeper and the Kafka server can be started directly under z/OS UNIX System Services with the shell scripts, it is much easier to run them as started tasks and to start and stop them with MVS™ commands. Add the following started task JCL procedures to your SYS1.PROCLIB data set.

Note: The following sample started task JCL procedures assume that:

- Bash is installed in /hsstools/bash-4.3/.
- Java is installed in /usr/lpp/java/J8.0_64.

Create SYS1.PROCLIB(ZKEESTRT) to start ZooKeeper:

//ZKEESTRT PROC                                                   
//ZCON     EXEC PGM=BPXBATSL,REGION=0M,TIME=NOLIMIT
//STDOUT   DD   SYSOUT=*                                          
//STDERR   DD   SYSOUT=*                                          
//STDIN    DD   DUMMY                                             
//STDPARM  DD   *                                                 
PGM /hsstools/bash-4.3/bin/bash                                             
/u/kafka/kafka_2.13-2.6.0/bin/zookeeper-server-start.sh 
/u/kafka/kafka_2.13-2.6.0/config/zookeeper.properties
//STDENV   DD   *                                                 
_BPX_SPAWN_SCRIPT=YES               
_BPX_SHAREAS=YES                    
_BPXK_AUTOCVT=ON                    
JAVA_HOME=/usr/lpp/java/J8.0_64     
TZ=EST5EDT 
//*
//         PEND

Create SYS1.PROCLIB(ZKEESTOP) to stop ZooKeeper:

//ZKEESTOP PROC                                      
//ZCON     EXEC PGM=BPXBATSL,REGION=0M  
//STDOUT   DD   SYSOUT=*                             
//STDERR   DD   SYSOUT=*                             
//STDIN    DD   DUMMY                                                               
//STDPARM  DD   *                                    
PGM /bin/sh                                          
/u/kafka/kafka_2.13-2.6.0/bin/zookeeper-server-stop.sh       
//STDENV   DD   *                                    
_BPX_SPAWN_SCRIPT=YES                                
_BPX_SHAREAS=YES     
_BPXK_AUTOCVT=ON                                                       
JAVA_HOME=/usr/lpp/java/J8.0_64 
JOBNAME=ZKEESTRT                     
//*                                                  
//         PEND

The JOBNAME parameter specifies the SYS1.PROCLIB member name of the ZooKeeper started task JCL. If you use a different member name, update this parameter accordingly.

Create SYS1.PROCLIB(KAFKSTRT) to start the Kafka server:

//KAFKSTRT PROC                                                   
//KFK      EXEC PGM=BPXBATSL,REGION=0M,TIME=NOLIMIT,MEMLIMIT=4G   
//STDOUT   DD   SYSOUT=*                                          
//STDERR   DD   SYSOUT=*                                          
//STDIN    DD   DUMMY                                                                                                                            
//STDPARM  DD   *                                                 
PGM /hsstools/bash-4.3/bin/bash                                             
/u/kafka/kafka_2.13-2.6.0/bin/kafka-server-start.sh
/u/kafka/kafka_2.13-2.6.0/config/server.properties
//STDENV   DD   *                                                 
_BPX_SPAWN_SCRIPT=YES                    
_BPX_SHAREAS=YES
_BPXK_AUTOCVT=ON                         
_EDC_AUTO_MAP64=YES                         
JAVA_HOME=/usr/lpp/java/J8.0_64        
TZ=EST5EDT
//*
//         PEND

Create SYS1.PROCLIB(KAFKSTOP) to stop the Kafka server:

//KAFKSTOP PROC                                      
//KFK      EXEC PGM=BPXBATSL,REGION=0M  
//STDOUT   DD   SYSOUT=*                             
//STDERR   DD   SYSOUT=*                             
//STDIN    DD   DUMMY                                                               
//STDPARM  DD   *                                    
PGM /bin/sh                                          
/u/kafka/kafka_2.13-2.6.0/bin/kafka-server-stop.sh       
//STDENV   DD   *                                    
_BPX_SPAWN_SCRIPT=YES                                
_BPX_SHAREAS=YES       
_BPXK_AUTOCVT=ON                                    
JAVA_HOME=/usr/lpp/java/J8.0_64    
JOBNAME=KAFKSTRT                  
//*                                                  
//         PEND

The JOBNAME parameter specifies the SYS1.PROCLIB member name of the Kafka server started task JCL. If you use a different member name, update this parameter accordingly.

Add a new rule in SYS1.PARMLIB(SMFLIMxx) to set the maximum number of shared pages above the bar that the Kafka server can use. If there are not enough shared pages, the Kafka server might fail with a Java OutOfMemoryError. In the following example, the value of MAXSHARE for the Kafka server started task KAFKSTRT is set to 500,000 pages (approximately 2 GB).
```
REGION JOBNAME(KAFKSTRT) MAXSHARE(500000)
```
If the Kafka server has a large number of topics or partitions, increase the value of MAXSHARE.
After you update the PARMLIB member SMFLIMxx, either perform an IPL or issue the SET SMFLIM=xx command to apply the new values dynamically.
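When you choose a different MAXSHARE value, the conversion between pages and bytes can be sketched as follows. The sketch assumes 4 KB (4096-byte) pages, which is consistent with 500,000 pages being approximately 2 GB; the 4 GiB target is only an illustration, not a recommendation.

```shell
page_bytes=4096            # assumption: one page is 4 KB

# Size that corresponds to the MAXSHARE(500000) example.
maxshare_pages=500000
maxshare_bytes=$((maxshare_pages * page_bytes))
echo "MAXSHARE($maxshare_pages) allows $maxshare_bytes bytes of shared pages"

# Number of pages needed for an illustrative 4 GiB target.
target_gib=4
target_pages=$((target_gib * 1024 * 1024 * 1024 / page_bytes))
echo "A ${target_gib} GiB target corresponds to MAXSHARE($target_pages)"
```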
Start ZooKeeper and the Kafka server on z/OS. For instructions, see Operating Kafka on z/OS.

Results

A stand-alone Apache Kafka server is set up on z/OS.

What to do next

After the basic stand-alone Apache Kafka server runs normally on z/OS, you can perform additional configuration as documented on the Apache Kafka website. For example, you might configure a Kafka cluster with brokers running on several LPARs in a Parallel Sysplex®, or configure Transport Layer Security (TLS) or other settings.

Important: The console producer (kafka-console-producer.sh) and console consumer (kafka-console-consumer.sh) commands that are provided with Apache Kafka do not work under z/OS UNIX System Services. To verify that Apache Kafka on z/OS is working correctly, you can run these commands on a distributed platform and connect to the Kafka server that runs on z/OS. You can also test with your own Java programs.
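A verification run from a distributed platform might be scripted along the following lines. This is a sketch under stated assumptions, not a supported procedure: the KAFKA_HOME path, the BOOTSTRAP address, the topic name zos-smoke-test, and the three function names are all placeholders, and the distributed system is assumed to have a Kafka 2.6.0 package unpacked with the standard bin scripts.

```shell
# Run on the distributed platform, not on z/OS.
# Assumptions: KAFKA_HOME, BOOTSTRAP, and TOPIC are placeholders to adapt.
KAFKA_HOME=${KAFKA_HOME:-/opt/kafka_2.13-2.6.0}
BOOTSTRAP=${BOOTSTRAP:-yourhostname:9092}
TOPIC=zos-smoke-test

# Create a single-partition test topic on the broker that runs on z/OS.
create_topic() {
  "$KAFKA_HOME/bin/kafka-topics.sh" --bootstrap-server "$BOOTSTRAP" \
    --create --topic "$TOPIC" --partitions 1 --replication-factor 1
}

# Send one message, passed as the first argument, to the test topic.
produce_one() {
  printf '%s\n' "$1" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
    --bootstrap-server "$BOOTSTRAP" --topic "$TOPIC"
}

# Read one message back, giving up after 10 seconds without data.
consume_one() {
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" --bootstrap-server "$BOOTSTRAP" \
    --topic "$TOPIC" --from-beginning --max-messages 1 --timeout-ms 10000
}
```

Calling create_topic, then produce_one "hello from a distributed client", then consume_one should print the message back if the listener that is configured in server.properties is reachable from the distributed system.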