Installing IBM MDM Publisher on internet-connected Minikube (for trial or development environments only)

To deploy IBM® MDM Publisher in a trial or development environment using Minikube, you must install Minikube and configure the MDM Publisher deployment.

Before you begin

Note: These instructions are for installing MDM Publisher in an internet-connected environment. If you intend to install in an offline environment, see Installing IBM MDM Publisher on offline Minikube (for trial or development environments only).
Before you begin installing MDM Publisher in an online Minikube environment:

About this task

Note: For production environments, you must deploy MDM Publisher to a Kubernetes cluster. Minikube deployments only support development or trial use. For information about installing MDM Publisher on Kubernetes, see Installing IBM MDM Publisher in an internet-connected Kubernetes cluster.

MDM Publisher is installed and deployed using a Helm chart, which is packaged in an installation binary. You can install the MDM Publisher Helm chart either by running the scripts included in the installation binary or by using unattended mode, which runs Helm commands directly.

The MDM Publisher distribution assets come with an installation file called publisher-helm-installer.bin. When you run the file, it creates a directory called mdm-publisher. This directory contains scripts and artifacts required to set up MDM Publisher. The file also provides you with information about using the artifacts to set up and configure your MDM Publisher instance.

Procedure

  1. On a computer connected to the internet, run publisher-helm-installer.bin.
    ./publisher-helm-installer.bin

    Confirm that the script created a directory called mdm-publisher.
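
    The confirmation in this step can be scripted. The following is a minimal sketch, not part of the MDM Publisher distribution: the function name check_install_dir is hypothetical, and it assumes that INSTALL_LOC (or the current directory) is where you ran the installer.

```shell
#!/bin/sh
# Hypothetical helper: confirm that the installer created the mdm-publisher directory.
# Argument 1: directory where publisher-helm-installer.bin was run
# (defaults to $INSTALL_LOC, or the current directory if INSTALL_LOC is unset).
check_install_dir() {
    base="${1:-${INSTALL_LOC:-.}}"
    if [ -d "$base/mdm-publisher" ]; then
        echo "found: $base/mdm-publisher"
    else
        echo "missing: $base/mdm-publisher" >&2
        return 1
    fi
}
```

    For example, run check_install_dir "$PWD" immediately after the installer finishes.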

  2. Depending on how much data you intend to bulk load with MDM Publisher, you might need to adjust the CPU and memory that Minikube allocates to it. The default allocations are small (8 executors with 1280 MB of memory) and must be increased for larger workloads. To adjust the resource allocations:
    1. Open ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/values-minikube.yaml.
    2. Update the resource allocations as required for your deployment. For more information about configuration and workload sizes, see Configuring IBM MDM Publisher.
    3. Specify the number of Spark executors for running MDM Publisher jobs. Edit the following properties in the YAML file:
      spark:
      ........  
        sparktransform:
          executor:
            instances: "4"  
          shufflePartitions: "50"
          memoryOverheadFactor: "0.1"
          driver:
            memory: "2g"
          mem: "1024m"
          limit:
            cores: "1"
        sparkextract:
          largetable:
            executor:
              instances: "4"
          smalltable:
            executor:
              instances: "1"        
          memoryOverheadFactor: "0.1"
          driver:
            memory: "2g"
          mem: "1024m"
          limit:
            cores: "1"
        graphBatchCommitSize: 100 # Size of a single commit to graph in a spark job
      Tip: The number of executor pods can be different for each MDM Publisher job stage (extract and transform).
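
    When choosing these values, it can help to estimate how much memory one job stage will request from the Minikube host. The sketch below is a rough calculation only. It assumes that mem is the per-executor memory and that each pod requests its memory plus an overhead of max(384 MiB, memory × memoryOverheadFactor), which is how stock Apache Spark sizes pods; MDM Publisher might apply these properties differently. The function name estimate_stage_mib is hypothetical.

```shell
#!/bin/sh
# Rough memory estimate (in MiB) for one Spark job stage.
# ASSUMPTION: pods are sized as memory + max(384, memory * overhead factor),
# as in stock Apache Spark. MDM Publisher may size pods differently.
# Arguments: executor instances, executor memory (MiB), driver memory (MiB),
#            overhead factor expressed as a percentage (0.1 -> 10).
estimate_stage_mib() {
    instances=$1
    exec_mib=$2
    driver_mib=$3
    overhead_pct=$4

    exec_overhead=$(( exec_mib * overhead_pct / 100 ))
    if [ "$exec_overhead" -lt 384 ]; then exec_overhead=384; fi

    driver_overhead=$(( driver_mib * overhead_pct / 100 ))
    if [ "$driver_overhead" -lt 384 ]; then driver_overhead=384; fi

    echo $(( instances * (exec_mib + exec_overhead) + driver_mib + driver_overhead ))
}

# Sample values from the YAML above: 4 executors, mem 1024m, driver 2g, factor 0.1
estimate_stage_mib 4 1024 2048 10
```

    With the sample values, the function prints 8064, that is, roughly 8 GiB for a single stage under these assumptions. Compare the result with the memory available to the Minikube host before you increase the executor count.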
  3. If you intend to use this MDM Publisher instance to connect to the Master Data Management service on IBM Cloud® or IBM Cloud Pak® for Data as a Service, edit the configuration to enable the connection. For more information, see Connecting MDM Publisher to the IBM Match 360 service on Cloud Pak for Data as a Service.
  4. Copy and run the following sample script to quickly install and configure a Minikube development environment, including all of its dependency packages.
    Note: For more information about installing Minikube, see the instructions documented on the Kubernetes site: https://kubernetes.io/docs/tasks/tools/install-minikube/.
    #!/bin/sh
    
    #install, enable, and start Docker
    sudo yum install -y yum-utils device-mapper-persistent-data lvm2
    sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    sudo yum install -y docker-ce docker-ce-cli containerd.io
    sudo systemctl enable docker
    sudo systemctl start docker
    sudo systemctl disable firewalld
    
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    sudo swapoff -a
    
    #install kubectl
    cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo > /dev/null
    [kubernetes]
    name=Kubernetes
    baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    EOF
    sudo yum install -y kubelet-1.15.4 kubectl-1.15.4
    
    # set up network config
    sudo modprobe br_netfilter
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf > /dev/null
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    EOF
    sudo sysctl --system
    sudo sysctl net.bridge.bridge-nf-call-iptables=1
    sudo sysctl net.ipv4.ip_forward=1
    
    #install socat
    sudo yum install socat -y
    
    #install Java
    sudo yum install -y java-1.8.0-openjdk
    
    #install Minikube. Detailed installation instructions are available on the Kubernetes site at https://kubernetes.io/docs/tasks/tools/install-minikube/
    curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube
    sudo install minikube /usr/local/bin
    
    #start minikube with no vm driver, running containers directly on local Docker
    minikube start --vm-driver=none --kubernetes-version v1.15.4
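
    After the script finishes, you might want to confirm that the cluster is usable before you continue. The following is a minimal sketch using standard minikube and kubectl commands; the function name verify_minikube is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: confirm that Minikube is running and that at least one node is Ready.
verify_minikube() {
    # minikube status exits non-zero when a cluster component is not running
    minikube status || return 1

    # Succeed only if at least one node reports STATUS "Ready" (second column)
    kubectl get nodes --no-headers |
        awk '$2 == "Ready" { ready++ } END { exit (ready > 0 ? 0 : 1) }'
}
```

    Run verify_minikube && echo "cluster ready". A non-zero exit status means that Minikube is not running or that no node has reached the Ready status yet.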
  5. Ensure that all of your secure endpoints are up and running.
    The MDM Publisher security setup wizard that you will run in the next step supports the following endpoints:
    • Master Data Connect:
      • Master Data Connect server
      • IBM Aspera® High-Speed Transfer Server (HSTS)
    • InfoSphere MDM:
      • Ongoing synchronization server (Apache Kafka)
      • Database server (Db2®, Db2 for z/OS®, or Oracle)
      • For virtual MDM deployments, the MDM application server (WebSphere® Application Server)
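
    One way to check that an endpoint is accepting connections is to probe its TCP port from the Minikube host. The following is a minimal sketch, not an IBM-provided tool: the function name check_endpoint is hypothetical, and the probe relies on the /dev/tcp redirection in bash, so bash must be installed. A successful probe shows only that the port is open, not that the service behind it is healthy.

```shell
#!/bin/sh
# Hypothetical helper: test whether a TCP port accepts connections.
# Arguments: host name, port number.
check_endpoint() {
    host=$1
    port=$2
    # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
    # timeout abandons the attempt after 5 seconds.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "reachable: $host:$port"
    else
        echo "unreachable: $host:$port" >&2
        return 1
    fi
}
```

    For example, check_endpoint jujitsu1.example.ibm.com 30299 probes a Master Data Connect server at the example host and port used later in this procedure. Substitute the hosts and ports of your own endpoints.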
  6. Initialize the MDM Publisher installation by running the following script:
    ${INSTALL_LOC}/mdm-publisher/bin/init_publisher.sh
    The initialization script includes a number of startup actions, some of which require your input:
    • Starts a security setup wizard. Use the wizard to configure secure SSL communication between MDM Publisher and other systems such as InfoSphere MDM, Master Data Connect, and their underlying systems. The wizard prompts you for parameters, imports server certificates into the corresponding MDM Publisher trust stores, and creates the artifacts required for secure communication. Certificates are extracted and placed into the cert_management folder.
      Note: The Master Data Connect installation on Minikube does not have security enabled on instances of Apache Cassandra and Elasticsearch. As a result, when you run the init_publisher.sh script in a Minikube environment, skip the steps that set up new Cassandra and Elasticsearch certificates. These instances are not secured and will not provide a valid certificate. Specifically, answer N or n to the following prompts:
      Would you like to authorize a new Cassandra endpoint? (Yy/Nn)
      and
      Would you like to authorize a new elasticsearch endpoint? (Yy/Nn)
    • Downloads and installs the MDM Publisher image.
    • Initializes the MDM Publisher container.
    Important: Do not try to access the MDM Publisher container until it is in a READY state. It can take several minutes for MDM Publisher to initialize. The first initialization takes longer than subsequent initializations.
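
    Rather than polling manually, you can block until the pod reports that it is ready. The following is a minimal sketch using kubectl wait. It assumes the mdm-publisher namespace and the mdm-publisher-0 pod name that appear in the commands later in this procedure; both might differ in your deployment. The function name wait_for_publisher is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: block until the MDM Publisher pod reports the Ready condition.
# ASSUMPTION: namespace "mdm-publisher" and pod "mdm-publisher-0", as used in the
# helm and kubectl commands elsewhere in this procedure.
# Argument 1: timeout in seconds (defaults to 900).
wait_for_publisher() {
    timeout_secs="${1:-900}"
    kubectl -n mdm-publisher wait --for=condition=Ready pod/mdm-publisher-0 \
        --timeout="${timeout_secs}s"
}
```

    Run wait_for_publisher 1200 to allow up to 20 minutes for the first initialization. The command returns an error immediately if the pod does not exist yet, so run it only after the initialization script has created the pod.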
  7. Secure a connection between MDM Publisher and a Master Data Connect instance.
  8. To modify the MDM Publisher configuration for an existing MDM Publisher deployment, complete the following steps.
    1. Update the appropriate configuration map YAML file with your configuration changes:
      • ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/templates/publish-config-minikube.yaml
      • ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/templates/publisher-wlp-configmap.yaml
      • ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/templates/data-transfer-configuration.yaml
    2. Run the following script to apply the new configuration:
      ${INSTALL_LOC}/mdm-publisher/bin/update_configuration.sh
    Tip: As an alternative method of updating the MDM Publisher configuration, you can use a silent (headless) mode. This method can be useful if you frequently need to update a large number of endpoints.
    1. Record a silent mode response file by running the following command.
      helm get values ibm-publisher-services-prod-369994195 --namespace mdm-publisher > /root/install_bin/values-headless.yaml
    2. Edit the values file /root/install_bin/values-headless.yaml to provide the updated information for each endpoint in a block of code. For example:
      global:
        authorizedEndpoints:
          - type: MDC
            host: jujitsu1.example.ibm.com
            alias: mdc444a
            port: 30299
            aspera_host: jujitsu1.example.ibm.com
            aspera_alias: asp444a
            aspera_port: 31000
          - type: MDM_jetty
            host: dockermdm1.example.ibm.com
            alias: jet444a
            port: 4070
          - type: MDM_server
            host: rajumdm1.example.ibm.com
            alias: mdm444a
            port: 9443
          - type: MDM_jetty
            host: rajumdm1.example.ibm.com
            alias: jet444a2
            port: 4070
    3. Add a comment to the first line of the /root/install_bin/values-headless.yaml file that says USER-SUPPLIED VALUES:
    4. Run the following two commands to apply the changes to your existing MDM Publisher deployment. Replace the example values with values relevant to your deployment.
      helm upgrade --namespace mdm-publisher --reuse-values --values /root/install_bin/values-headless.yaml ibm-publisher-services-prod-369994195 ./mdm-publisher/ibm-publisher-services-prod
      kubectl -n mdm-publisher delete pod mdm-publisher-0 mdm-publisher-aspera-client-sts-0
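
    Because the release name in the preceding commands (ibm-publisher-services-prod-369994195) is only an example, the following sketch shows one way to look up the actual release name and then apply a values file with the same helm and kubectl commands. The function names find_publisher_release and apply_headless_values are hypothetical, and the sketch assumes the mdm-publisher namespace and pod names shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the Helm releases in a namespace.
# Argument 1: namespace (defaults to mdm-publisher).
find_publisher_release() {
    helm list --namespace "${1:-mdm-publisher}" --short
}

# Hypothetical helper: apply a headless values file, then restart the pods.
# Arguments: release name, values file, chart directory.
# ASSUMPTION: the namespace and pod names match those in the commands above.
apply_headless_values() {
    release=$1
    values_file=$2
    chart_dir=$3

    if [ ! -f "$values_file" ]; then
        echo "values file not found: $values_file" >&2
        return 1
    fi

    # Same upgrade command as in the procedure, with the example values parameterized
    helm upgrade --namespace mdm-publisher --reuse-values \
        --values "$values_file" "$release" "$chart_dir" || return 1

    # Delete the pods so that they are re-created with the new configuration
    kubectl -n mdm-publisher delete pod mdm-publisher-0 mdm-publisher-aspera-client-sts-0
}
```

    For example, run find_publisher_release to list the release names, and then run apply_headless_values with the name you found, /root/install_bin/values-headless.yaml, and ./mdm-publisher/ibm-publisher-services-prod.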

What to do next

Now that you have installed and deployed MDM Publisher on Minikube, you might want to take the following actions: