To deploy IBM® MDM Publisher in a trial or development environment using Minikube, you must install Minikube and configure the MDM Publisher deployment.
Before you begin
Before you begin installing MDM Publisher in an online Minikube environment:
About this task
MDM Publisher installation and deployment are done using a Helm chart. The MDM Publisher Helm chart is wrapped in an installation binary (.bin file). You can install the MDM Publisher Helm chart either by running the scripts included in the installation binary or by using unattended mode, which uses direct Helm commands.
The MDM Publisher distribution assets include an installation file called publisher-helm-installer.bin. When you run the file, it creates a directory called mdm-publisher. This directory contains the scripts and artifacts required to set up MDM Publisher. The file also provides you with information about using the artifacts to set up and configure your MDM Publisher instance.
Procedure
- On a computer connected to the internet, run publisher-helm-installer.bin.
./publisher-helm-installer.bin
Confirm that the script created a directory called mdm-publisher.
- Depending on the amount of data that you intend to bulk load using MDM Publisher, you might need to adjust the amount of CPU and memory that Minikube allocates to it. The default allocations are small (8 executors with 1280 MB of memory) and must be increased for larger workloads. To adjust the resource allocations:
- Open ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/values-minikube.yaml.
- Update the resource allocations as required for your deployment. For more information about configuration and workload sizes, see Configuring IBM MDM Publisher.
- Specify the number of Spark executors for running MDM Publisher jobs. Edit the following properties in the YAML file:
spark:
........
sparktransform:
executor:
instances: "4"
shufflePartitions: "50"
memoryOverheadFactor: "0.1"
driver:
memory: "2g"
mem: "1024m"
limit:
cores: "1"
sparkextract:
largetable:
executor:
instances: "4"
smalltable:
executor:
instances: "1"
memoryOverheadFactor: "0.1"
driver:
memory: "2g"
mem: "1024m"
limit:
cores: "1"
graphBatchCommitSize: 100 # Size of a single commit to graph in a spark job
Tip: The number of executor pods can be different for each MDM Publisher job stage (extract and transform).
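As a rough aid for sizing, the following sketch estimates how much memory a given executor count would request in total. It assumes that the memory figure is per executor and that memoryOverheadFactor is applied as a simple multiplier; neither assumption is confirmed by this topic, and Spark can apply its own minimum overhead. The function name estimate_executor_memory_mb is hypothetical.

```shell
#!/bin/sh
# Rough estimate of the total memory requested by Spark executors.
# Assumption: memory is per executor and the overhead factor is a plain multiplier.
estimate_executor_memory_mb() {
  instances=$1       # number of executor pods
  mem_mb=$2          # memory per executor, in MB
  overhead_pct=$3    # overhead as an integer percentage (0.1 becomes 10)
  # Integer arithmetic: instances * mem * (1 + overhead)
  echo $(( instances * mem_mb * (100 + overhead_pct) / 100 ))
}

# Example: 8 executors, 1280 MB each, 10% overhead
estimate_executor_memory_mb 8 1280 10
```

Compare the result with the memory available to Minikube on the host before increasing the executor count.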
- If you intend to use this MDM Publisher instance to connect to the Master Data Management service on IBM Cloud® or IBM Cloud Pak® for Data as a Service, edit the configuration to enable the connection. For more information, see Connecting MDM Publisher to the IBM Match 360 service on Cloud Pak for Data as a Service.
- Copy and run the following sample script to quickly install and configure a Minikube development environment, including all of its dependency packages.
#!/bin/sh
#install, enable and start docker
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce docker-ce-cli containerd.io
sudo systemctl enable docker
sudo systemctl start docker
sudo systemctl disable firewalld
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
sudo swapoff -a
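#install kubectl
# Write the repo file through sudo tee; a plain redirect runs without root privileges
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo > /dev/null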
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
sudo yum install -y kubelet-1.15.4 kubectl-1.15.4
# set up network config
sudo modprobe br_netfilter
# sudo does not apply to a shell redirect, so write the file through sudo tee
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf > /dev/null
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
sudo sysctl net.bridge.bridge-nf-call-iptables=1
sudo sysctl net.ipv4.ip_forward=1
#install socat
sudo yum install socat -y
#install Java
sudo yum install -y java-1.8.0-openjdk
#install Minikube. Detailed installation instructions are available on the Kubernetes site at https://kubernetes.io/docs/tasks/tools/install-minikube/
curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube
sudo install minikube /usr/local/bin
#start minikube with no vm driver, running containers directly on local Docker
minikube start --vm-driver=none --kubernetes-version v1.15.4
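After the script finishes, it can help to confirm that the cluster is actually up before continuing. The following is a minimal sketch that uses only standard minikube and kubectl commands; the function name check_minikube is hypothetical and is not part of the MDM Publisher tooling.

```shell
#!/bin/sh
# Quick health check for a freshly started Minikube cluster.
check_minikube() {
  # Stop early if Minikube reports a problem with its components
  minikube status || { echo "Minikube is not running" >&2; return 1; }
  # List the cluster nodes; a single-node Minikube cluster shows one node
  kubectl get nodes || { echo "Cannot reach the Kubernetes API" >&2; return 1; }
  # Show the system pods so you can see whether they have started
  kubectl get pods --all-namespaces
}

# Usage: check_minikube
```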
- Ensure that all of your secure endpoints are up and running.
The MDM Publisher security setup wizard that you will run in the next step supports the following endpoints:
- Master Data Connect:
- Master Data Connect server
- IBM Aspera® High-Speed Transfer Server (HSTS)
- InfoSphere MDM:
- Ongoing synchronization server (Apache Kafka)
- Database server (Db2®, Db2 for z/OS®, or Oracle)
- For virtual MDM deployments, the MDM application server (WebSphere® Application Server)
- Initialize the MDM Publisher installation by running the following script:
${INSTALL_LOC}/mdm-publisher/bin/init_publisher.sh
The initialization script performs a number of startup actions, some of which require your input:
- Starts a security setup wizard. Use the wizard to configure secure SSL communication between MDM Publisher and other systems such as InfoSphere MDM, Master Data Connect, and their underlying systems. The wizard prompts you for parameters, imports server certificates into the corresponding MDM Publisher trust stores, and creates the artifacts needed for secure communication. Certificates are extracted and placed into the cert_management folder.
Note: The Master Data Connect installation on Minikube does not have security enabled on instances of Apache Cassandra and Elasticsearch. As a result, when running the init_publisher.sh script in a Minikube environment, skip the steps to set up new Cassandra and Elasticsearch certificates. These instances are not secured and will not provide a valid certificate. Specifically, answer N or n to the following prompts:
Would you like to authorize a new Cassandra endpoint? (Yy/Nn)
Would you like to authorize a new elasticsearch endpoint? (Yy/Nn)
- Downloads and installs the MDM Publisher image.
- Initializes the MDM Publisher container.
Important: Do not try to access the MDM Publisher container until it is in a READY state. It can take several minutes for MDM Publisher to initialize successfully. The first initialization takes longer than subsequent initializations.
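One way to wait for the READY state without polling by hand is kubectl wait. The following sketch assumes the namespace mdm-publisher and the pod name mdm-publisher-0, both taken from the kubectl command that appears later in this procedure; your deployment might use different names. The function name wait_for_publisher is hypothetical.

```shell
#!/bin/sh
# Assumed names, taken from the kubectl example later in this procedure
NAMESPACE=mdm-publisher
POD=mdm-publisher-0

# Block until the pod reports the Ready condition or the timeout expires.
# kubectl wait fails immediately if the pod has not been created yet.
wait_for_publisher() {
  timeout=${1:-900s}
  kubectl -n "$NAMESPACE" wait --for=condition=Ready "pod/$POD" --timeout="$timeout"
}

# Usage: wait_for_publisher 900s && kubectl -n "$NAMESPACE" get pods
```

The 900s default is an arbitrary allowance for the slower first initialization, not a documented value.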
- Secure a connection between MDM Publisher and a Master Data Connect instance.
- To modify the MDM Publisher configuration for an existing MDM Publisher deployment, complete the following steps.
- Update the appropriate configuration map YAML file with your configuration changes:
- ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/templates/publish-config-minikube.yaml
- ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/templates/publisher-wlp-configmap.yaml
- ${INSTALL_LOC}/mdm-publisher/ibm-publisher-services-prod/templates/data-transfer-configuration.yaml
- Run the following script to apply the new configuration:
${INSTALL_LOC}/mdm-publisher/bin/update_configuration.sh
Tip: As an alternative method of updating the MDM Publisher configuration, you can use a silent (headless) mode. This method can be useful if you frequently need to update a large number of endpoints.
- Record a silent mode response file by running the following command.
helm get values ibm-publisher-services-prod-369994195 --namespace mdm-publisher > /root/install_bin/values-headless.yaml
- Edit the values file /root/install_bin/values-headless.yaml to provide the updated information for each endpoint in a block of code. For example:
global:
  authorizedEndpoints:
  - type: MDC
    host: jujitsu1.example.ibm.com
    alias: mdc444a
    port: 30299
    aspera_host: jujitsu1.example.ibm.com
    aspera_alias: asp444a
    aspera_port: 31000
  - type: MDM_jetty
    host: dockermdm1.example.ibm.com
    alias: jet444a
    port: 4070
  - type: MDM_server
    host: rajumdm1.example.ibm.com
    alias: mdm444a
    port: 9443
  - type: MDM_jetty
    host: rajumdm1.example.ibm.com
    alias: jet444a2
    port: 4070
- Add a comment to the first line of the /root/install_bin/values-headless.yaml file that says USER-SUPPLIED VALUES:
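The first-line comment can also be added from the command line. The following sketch is one possible approach, not part of the product tooling: the function name mark_user_supplied is hypothetical, and it assumes that a bare USER-SUPPLIED VALUES: line written by helm get values should be turned into a comment rather than duplicated.

```shell
#!/bin/sh
# Ensure the first line of a values file is the comment "# USER-SUPPLIED VALUES:".
mark_user_supplied() {
  file=$1
  header='# USER-SUPPLIED VALUES:'
  first=$(head -n 1 "$file") || return 1
  tmp=$(mktemp) || return 1
  case "$first" in
    "$header")
      # Already commented: nothing to do
      rm -f "$tmp"; return 0 ;;
    'USER-SUPPLIED VALUES:')
      # Bare header line: replace it with the commented form
      { printf '%s\n' "$header"; tail -n +2 "$file"; } > "$tmp" ;;
    *)
      # No header at all: prepend the comment
      { printf '%s\n' "$header"; cat "$file"; } > "$tmp" ;;
  esac
  # Copy back in place so the original file keeps its permissions
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage: mark_user_supplied /root/install_bin/values-headless.yaml
```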
- Run the following two commands to apply the changes to your existing MDM Publisher deployment. Replace the example values with values relevant to your deployment.
helm upgrade --namespace mdm-publisher --reuse-values --values /root/install_bin/values-headless.yaml ibm-publisher-services-prod-369994195 ./mdm-publisher/ibm-publisher-services-prod
kubectl -n mdm-publisher delete pod mdm-publisher-0 mdm-publisher-aspera-client-sts-0
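The release name in the commands above (ibm-publisher-services-prod-369994195) is an example. The following sketch shows one way to look up the name in your own environment with helm list. It assumes that the release name starts with the chart name ibm-publisher-services-prod and that the namespace is mdm-publisher, as in the examples; the function name find_publisher_release is hypothetical.

```shell
#!/bin/sh
# Print the Helm release names in the namespace that start with the chart name.
find_publisher_release() {
  namespace=${1:-mdm-publisher}
  # --short prints release names only, one per line
  helm list --namespace "$namespace" --short | grep '^ibm-publisher-services-prod'
}

# Usage:
#   RELEASE=$(find_publisher_release)
#   helm get values "$RELEASE" --namespace mdm-publisher
```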
What to do next
Now that you have installed and deployed MDM Publisher on Minikube, you might want to take the following actions: