Cloud Pak for Data System 2.0.2.0 is the first release that comes with a Red Hat OpenShift version that provides production support for configuring multipath while the system is online. Learn how to configure the multipath.conf file for Netezza Performance Server if you are on Cloud Pak for Data System 2.0.2.0 or later.
With previous releases, you edited config files with ssh and vi. With Cloud Pak for Data System 2.0.2.0 and later, a new procedure is introduced because Red Hat OpenShift 4.X with Red Hat CoreOS disallows editing config files directly on nodes and requires machineconfig updates instead.
Before you begin
Cloud Pak for Data System 2.0.X with connector nodes comes preconfigured with multipath
settings for two families of IBM products that are commonly tested and used with Cloud Pak for Data
System 2.0.X. The families are:
- IBM FlashSystem
- IBM Storwize
If you have a storage product from these families, you might not need to configure the multipath.conf file. Gather the required multipath settings for that storage product and check whether its vendor and product values match the following settings. If they match, skip this procedure.
devices {
    device {
        vendor "IBM"
        product "FlashSystem-9840"
        path_selector "service-time 0"
        path_grouping_policy multibus
        path_checker tur
        rr_min_io_rq 4
        rr_weight uniform
        no_path_retry fail
        failback immediate
    }
    device {
        vendor "IBM"
        product "2145"
        path_grouping_policy "group_by_prio"
        path_selector "service-time 0"
        prio "alua"
        path_checker "tur"
        failback "immediate"
        no_path_retry fail
        rr_weight uniform
        rr_min_io_rq "1"
    }
}
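If you are not sure which vendor and product strings your storage presents, you can read them from the SCSI layer on the connector node. A minimal sketch, using standard sysfs paths on Red Hat systems:
# Vendor and model (product) strings that the kernel reports for each attached SCSI device:
cat /sys/class/scsi_device/*/device/vendor
cat /sys/class/scsi_device/*/device/model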
About this task
During the procedure, in step 5, Netezza Performance Server is stopped for 1 - 2 hours because a new
configuration is applied and the nodes that are designated for Netezza Performance Server host pods are rebooted.
You can complete steps 1 - 4 any time before you must stop Netezza Performance Server.
Tip: If you actively use the instance in production, plan the configuration ahead of time to account for the system outage (around 2 hours).
Procedure
- Gather and save the vendor-specific multipath device settings in a text file from an
existing Mako or other PureData System for Analytics system.
- If you already have a PureData System for Analytics system or another system in the same family to refer to, the information is in the /etc/multipath.conf file that worked with the wanted SAN equipment or the same family of SAN equipment (see the sketch after this list).
- If the information is not available, gather the necessary or recommended multipath settings from vendor documentation or from a vendor contact.
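For example, you can copy the known-good file from an existing system for reference. A sketch; pda-host1 is a hypothetical hostname standing in for your PureData System for Analytics host:
# Copy the working multipath configuration locally for reference.
scp root@pda-host1:/etc/multipath.conf ./reference-multipath.conf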
- Prepare a working directory for the procedure:
- ssh to e1n1.
ssh e1n1
- Create a /root/multipath_work directory.
mkdir -p /root/multipath_work
- Change directories to /root/multipath_work.
cd /root/multipath_work
- Edit the
newmultipath.conf file:
- Create a copy of the /etc/multipath.conf
file.
cp /etc/multipath.conf /root/multipath_work/newmultipath.conf
newmultipath.conf
specifies the name of a temporary file that is used for editing.
- Open the newmultipath.conf file in an editor.
vi /root/multipath_work/newmultipath.conf
- In the devices section, add a device structure that represents the settings that are recommended by the SAN storage vendor.
Important: Do not remove anything from the file. Add your device structure after the existing structures in the devices section.
- Encode the changes so that you can use the information in later
steps.
base64 /root/multipath_work/newmultipath.conf | tr -d \\n > /root/multipath_work/base64multipath1.txt
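Optionally, you can verify that the encoded file decodes back to exactly what you edited. A quick check with the base64 and diff utilities that ship with Red Hat:
# Decode the encoded copy and compare it with the edited file; no output means they match.
base64 -d /root/multipath_work/base64multipath1.txt | diff - /root/multipath_work/newmultipath.conf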
- Edit the /root/multipath_work/multipath-mcp.yaml file:
- Open a new file that is called multipath-mcp.yaml with the vi editor and enter insert mode.
vi /root/multipath_work/multipath-mcp.yaml
- Copy and paste the following information into the
file.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: nps-shared
  name: nps-shared-multipathing
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - contents:
            source: data:text/plain;charset=utf-8;base64,<--multipath_conf_file_data_base64_encoded-->
          filesystem: root
          mode: 420
          path: /etc/multipath.conf
    systemd:
      units:
        - name: iscsid.service
          enabled: true
        - name: multipathd.service
          enabled: true
- Replace the
<--multipath_conf_file_data_base64_encoded--> section with the
contents from the /root/multipath_work/base64multipath1.txt file.
- Exit the insert mode and write-quit to save the file and exit the vi
session.
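As an alternative to pasting the encoded data by hand, you can splice it into the placeholder with sed. A sketch, assuming the placeholder text appears exactly once in the file:
# Replace the placeholder with the base64 data. The '|' delimiter is used
# because base64 output can contain '/' characters.
sed -i "s|<--multipath_conf_file_data_base64_encoded-->|$(cat /root/multipath_work/base64multipath1.txt)|" /root/multipath_work/multipath-mcp.yaml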
- Stop Netezza Performance Server:
- oc -n NPS_NAMESPACE exec -it pod/ipshost-0 -c ipshost -- bash
- su - nz
- nzstop
- exit
- exit
- oc -n NPS_NAMESPACE scale sts/ipshost --replicas=0
Netezza Performance Server is stopped and you are back on e1n1.
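To confirm that Netezza Performance Server is fully stopped before you continue, you can check that the ipshost pod is gone. A quick check; as in the previous commands, NPS_NAMESPACE stands for your actual namespace:
# No output means that the ipshost pod was scaled down successfully.
oc -n NPS_NAMESPACE get pods | grep ipshost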
- Unpause the nps-shared machineconfig pool so that future steps do not hang.
oc patch mcp nps-shared --type json --patch '[{"op": "replace", "path": "/spec/paused", "value": false}]'
Example:
oc patch mcp nps-shared --type json --patch '[{"op": "replace", "path": "/spec/paused", "value": false}]'
machineconfigpool.machineconfiguration.openshift.io/nps-shared patched
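You can confirm the pause state of the pool at any time with a standard jsonpath query:
# Prints "false" when the pool is unpaused and updates can proceed.
oc get mcp nps-shared -o jsonpath='{.spec.paused}'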
- Start the multipath reconfiguration.
oc create -f /root/multipath_work/multipath-mcp.yaml
This command triggers a rolling reboot that is called a machineconfig update and
is managed by Red Hat OpenShift. The reboot takes place in a nondeterministic order among the nodes
in the nps-shared pool, including the connector node.
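To confirm that the MachineConfig object was created before the reboots begin, you can list the machine configs (mc is the standard short name for the machineconfig resource):
# The nps-shared-multipathing entry from multipath-mcp.yaml is listed along with the rendered configs.
oc get mc | grep nps-shared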
- Monitor the reboot until all of the fields in the
UPDATED column show
True.
oc get mcp
The reboot might take 5 - 15 minutes per node on the nps-shared pool.
Example:
First, the status changes to UPDATING = True and the node status for the nps-shared node that is being updated changes to SchedulingDisabled.
oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-d8537132ffb6d789ce8b2a7257833bf9 True False False 3 3 3 0 14d
nps-shared rendered-nps-shared-053d2105bc50eeb67b4cb50614e9a0da False True False 3 0 0 0 8d
unset rendered-unset-442d08db52ce65c62d60b906718744f6 True False False 0 0 0 0 14d
worker rendered-worker-442d08db52ce65c62d60b906718744f6 True False False 3 3 3 0 14d
oc get nodes
NAME STATUS ROLES AGE VERSION
e1n1-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n2-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n3-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n4.fbond Ready worker 14d v1.21.8+ee73ea2
e2n1.fbond Ready worker 14d v1.21.8+ee73ea2
e2n2.fbond Ready worker 14d v1.21.8+ee73ea2
e2n3.fbond Ready nps-shared,worker 14d v1.21.8+ee73ea2
e2n4.fbond Ready,SchedulingDisabled nps-shared,worker 14d v1.21.8+ee73ea2
e5n1.fbond Ready nps-shared,worker 13d v1.21.8+ee73ea2
Next, the nps-shared node status changes to NotReady:
oc get nodes
NAME STATUS ROLES AGE VERSION
e1n1-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n2-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n3-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n4.fbond Ready worker 14d v1.21.8+ee73ea2
e2n1.fbond Ready worker 14d v1.21.8+ee73ea2
e2n2.fbond Ready worker 14d v1.21.8+ee73ea2
e2n3.fbond Ready nps-shared,worker 14d v1.21.8+ee73ea2
e2n4.fbond NotReady,SchedulingDisabled nps-shared,worker 14d v1.21.8+ee73ea2
e5n1.fbond Ready nps-shared,worker 13d v1.21.8+ee73ea2
Then, the status goes back to Ready:
oc get nodes
NAME STATUS ROLES AGE VERSION
e1n1-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n2-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n3-master.fbond Ready master 14d v1.21.8+ee73ea2
e1n4.fbond Ready worker 14d v1.21.8+ee73ea2
e2n1.fbond Ready worker 14d v1.21.8+ee73ea2
e2n2.fbond Ready worker 14d v1.21.8+ee73ea2
e2n3.fbond Ready nps-shared,worker 14d v1.21.8+ee73ea2
e2n4.fbond Ready,SchedulingDisabled nps-shared,worker 14d v1.21.8+ee73ea2
e5n1.fbond Ready nps-shared,worker 13d v1.21.8+ee73ea2
The cycle repeats for the other nodes in the nps-shared pool.
When the update completes, UPDATING changes to False:
oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-d8537132ffb6d789ce8b2a7257833bf9 True False False 3 3 3 0 14d
nps-shared rendered-nps-shared-053d2105bc50eeb67b4cb50614e9a0da True False False 3 3 3 0 8d
unset rendered-unset-442d08db52ce65c62d60b906718744f6 True False False 0 0 0 0 14d
worker rendered-worker-442d08db52ce65c62d60b906718744f6 True False False 3 3 3 0 14d
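Rather than rerunning the commands by hand, you can watch both views refresh together; the 30-second interval here is an arbitrary choice:
# Refresh the pool and node status every 30 seconds until UPDATED shows True.
watch -n 30 "oc get mcp; oc get nodes"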
- SSH to the connector node and verify the multipath settings.
Important:
Do not modify any files during an ssh session. If you must modify files, do so by applying machineconfig updates.
Cloud Pak for Data System 2.0.X runs on Red Hat OpenShift 4.X with Red Hat CoreOS as the OS for the nodes. With CoreOS, the root file system and all config files are immutable. Any changes that you make manually during an ssh session are automatically wiped away by CoreOS at unpredictable times.
Also, any changes that you make during an ssh session might cause the OS to
become out of sync with the Red Hat OpenShift cluster state and prevent future operations (for
example, Cloud Pak for Data System upgrades) from completing.
- If the connector node is e5n1.fbond:
- On Cloud Pak for Data System, from e1n1, ssh into e5n1.fbond.
ssh core@e5n1
- sudo su
- Verify that the settings were changed
successfully:
cat /etc/multipath.conf
Ensure that all the LUNs that were
configured for use are in the output of the
multipath -ll command, and that all
paths are
Active Ready Running.
multipath -ll
Tip:
If you do not see the multipath devices, verify the following items:
- The LUNs are configured properly and have access to the WWNs of the Fibre Channel cards on the
connector node.
- The FC connections are physically cabled between the SAN storage device and the SAN switch or
switches and between the SAN switch or switches and the connector node or nodes.
- The relevant ports on the SAN switches are enabled and show link.
- The multipath settings,
path_checker specifically, are correct.
If issues occur, contact IBM Support and the vendors of the customer-owned SAN equipment.
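To help check these items, you can read the WWNs and link state of the Fibre Channel ports from sysfs on the connector node (standard paths under /sys/class/fc_host on Red Hat systems):
# World Wide Port Names to zone and mask on the SAN:
cat /sys/class/fc_host/host*/port_name
# Link state of each FC port; Online indicates an established link:
cat /sys/class/fc_host/host*/port_state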
- Pause the nps-shared machineconfig pool to return it to the wanted state. The nps-shared pool is kept paused except at select times to avoid an inadvertent outage.
oc patch mcp nps-shared --type json --patch '[{"op": "replace", "path": "/spec/paused", "value": true}]'
Example:
oc patch mcp nps-shared --type json --patch '[{"op": "replace", "path": "/spec/paused", "value": true}]'
machineconfigpool.machineconfiguration.openshift.io/nps-shared patched
- Restart Netezza Performance Server:
oc -n NPS_NAMESPACE scale sts/ipshost --replicas=1
Wait for the ipshost pod to spawn and verify that it is on a connector node. If the pod is not up in 5 minutes, check for issues.
watch "oc -n NPS_NAMESPACE get pods -o wide | grep ipshost"
- oc -n NPS_NAMESPACE exec -it pod/ipshost-0 -c ipshost -- bash
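Once inside the pod, you can start the database and confirm its state. nzstart and nzstate are the standard Netezza commands that are the counterparts of the nzstop command that you used earlier:
su - nz
nzstart
# Expect the system state to be reported as Online.
nzstate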