
Setting up Alternate NIM master with secure NIMSH ( NIMSH over SSL )

How To


Summary

This article walks you through setting up an alternate NIM master environment with secure NIMSH (NIMSH over SSL): configuring the alternate NIM master, setting up the clients, and verifying connectivity.

Environment

For the example, the following configuration is used:
Primary NIM-A: kronos.aus.stglabs.ibm.com
Alternate NIM-B: halloweenbso.aus.stglabs.ibm.com
Client LPAR: robotron.aus.stglabs.ibm.com

Steps

Setting up your NIM masters. 
Setting up your NIM masters requires the NIM master fileset, "bos.sysmgt.nim.master", which can be installed from a base media image.
For more info on NIM setup, check: https://www.ibm.com/support/pages/node/669865
The base media image can be obtained from the Entitled Systems Support site at: https://www.ibm.com/servers/eserver/ess/index.wss
The base image is in ISO format and can be mounted with the "loopmount" command like this: 
# loopmount -i /isos/AIX_v7.2_Install_7200-04-02-2027_DVD_1_of_2_072020_LCD8223013.iso -o "-V udfs -o ro" -m /mnt
 
# ls /mnt 
.Version      OSLEVEL       RPMS          image.data    ppc           usr
7200-04       README.aix    bosinst.data  installp      root
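
Before installing from the mount point, it can help to confirm that it really holds base install media. The following is a minimal sketch, assuming the layout shown in the listing above (an installp/ppc directory and an OSLEVEL entry); looks_like_aix_media is a hypothetical helper name, not an AIX command.

```sh
# Hypothetical helper: succeed only if the given mount point contains the
# entries that the "ls /mnt" listing above shows for AIX base media.
looks_like_aix_media() {
    mnt=$1
    # installp/ppc is where the filesets are installed from; OSLEVEL is
    # assumed to be present on every base image, as in the listing.
    [ -d "$mnt/installp/ppc" ] && [ -e "$mnt/OSLEVEL" ]
}
```

For example: looks_like_aix_media /mnt && echo "base media mounted"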

# cd /mnt/installp/ppc


At this point we can install the NIM master file set with the installp command: 
# installp -acgYXd . bos.sysmgt.nim.master
.............................

Finished processing all filesets.  (Total time:  12 secs).

+-----------------------------------------------------------------------------+
                                Summaries:
+-----------------------------------------------------------------------------+

Installation Summary
--------------------
Name                        Level           Part        Event       Result
-------------------------------------------------------------------------------
bos.sysmgt.nim.master       7.1.5.32        USR         APPLY       SUCCESS    
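
When the installation is scripted, the Result column of the summary above can be checked instead of reading it by eye. A minimal sketch, assuming the summary has been captured to a file or pipe; installp_result is a hypothetical helper name.

```sh
# Hypothetical helper: read an installp summary on stdin and print the
# Result column (the last field) of every line for the named fileset.
# A fileset with several parts (for example USR and ROOT) prints one
# result per part.
installp_result() {
    awk -v fs="$1" '$1 == fs { print $NF }'
}
```

For example, with the summary saved to /tmp/installp.log (an assumed path): installp_result bos.sysmgt.nim.master < /tmp/installp.log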
Once done, you can initialise the NIM master with "# smitty nimconfig" for a quick setup.
 
# smitty nimconfig 

* Network Name                                       [<Name for your NIM network>]
* Primary Network Install Interface                  [<Interface for NIM communication ,en0 etc. > ]  

  Allow Machines to Register Themselves as Clients?  [yes]                        +
  Alternate Port Numbers for Network Communications
       (reserved values will be used if left blank)
    Client Registration                              []                            #
    Client Communications                            []                            #
  IP Protocol Version Preference                     []                           +#

After the configuration completes, display the master definition to verify it:
# lsnim -l master
master:
   class               = machines
   type                = master
   max_nimesis_threads = 20
   if_defined          = chrp.64.ent
   comments            = machine which controls the NIM environment
   platform            = chrp
   netboot_kernel      = 64
   if1                 = master_net kronos.aus.stglabs.ibm.com A62EF2408A02
   cable_type1         = N/A
   Cstate              = ready for a NIM operation
   prev_state          = ready for a NIM operation
   Mstate              = currently running
   serves              = boot
   serves              = nim_script
   master_port         = 1058
   registration_port   = 1059
   reserved            = yes
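
Individual attributes can be pulled out of the "lsnim -l" output for use in scripts. A minimal sketch, assuming the "name = value" layout shown above; nim_attr is a hypothetical helper name.

```sh
# Hypothetical helper: read "lsnim -l <object>" output on stdin and print
# the value(s) of one attribute. Attributes that occur more than once,
# such as "serves", print one value per line.
nim_attr() {
    awk -v key="$1" '$1 == key && $2 == "=" {
        sub(/^[^=]*=[ \t]*/, "")   # drop everything up to and including "= "
        print
    }'
}
```

For example: lsnim -l master | nim_attr master_port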

Generating SSL certificates and enabling NIMSH over SSL
When the quick master setup is complete, enable secure NIMSH on both NIM masters with "# nimconfig -c".
This generates SSL certificates in the /ssl_nimsh directory and adds the ssl_support=yes attribute to the master definition:
# nimconfig -c
0513-029 The tftpd Subsystem is already active.
Multiple instances are not supported.
NIM_MASTER_HOSTNAME=kronos.aus.stglabs.ibm.com
x - /usr/lib/libssl.so
x - /usr/lib/libcrypto.so
Target "all" is up to date.
Generating a RSA private key
...............+++++
.......+++++
writing new private key to '/ssl_nimsh/keys/rootkey.pem'
-----
Signature ok
subject=/C=US/ST=Texas/L=Austin/O=ibm.com/CN=Root CA
Getting Private key
Generating a RSA private key
............+++++
............................................................................+++++
writing new private key to '/ssl_nimsh/keys/clientkey.pem'
-----
Signature ok
subject=/C=US/ST=Texas/L=Austin/O=ibm.com
Getting CA Private Key
Generating a RSA private key
..+++++
....+++++
writing new private key to '/ssl_nimsh/keys/serverkey.pem'
-----
Signature ok
subject=/C=US/ST=Texas/L=Austin/O=ibm.com
Getting CA Private Key



# lsnim -l master | grep ssl
   ssl_support         = yes
This creates several directories that hold the SSL certificates, keys and configuration files for NIMSH:
# ls -ltR /ssl_nimsh
total 8
drwx------    2 root     system         4096 Aug 18 08:46 certs
drwx------    2 root     system          256 Aug 18 08:46 keys
drwx------    2 root     system          256 Aug 18 08:46 configs
./certs:
total 96
-rw-r--r--    1 root     system         4303 Aug 18 08:46 server.pem
-rw-r--r--    1 root     system         1200 Aug 18 08:46 servercert.pem
-rw-r--r--    1 root     system           17 Aug 18 08:46 root.srl
-rw-r--r--    1 root     system         1045 Aug 18 08:46 serverreq.pem
-rw-r--r--    1 root     system         4303 Aug 18 08:46 client.pem
-rw-r--r--    1 root     system         1200 Aug 18 08:46 clientcert.pem
-rw-r--r--    1 root     system         1045 Aug 18 08:46 clientreq.pem
-rw-r--r--    1 root     system         3103 Aug 18 08:46 root.pem
-rw-r--r--    1 root     system         1399 Aug 18 08:46 rootcert.pem
-rw-r--r--    1 root     system          976 Aug 18 08:46 rootreq.pem

./keys:
total 24
-rw-r--r--    1 root     system         1704 Aug 18 08:46 serverkey.pem
-rw-r--r--    1 root     system         1704 Aug 18 08:46 clientkey.pem
-rw-r--r--    1 root     system         1704 Aug 18 08:46 rootkey.pem

./configs:
total 48
-rw-r--r--    1 root     system         1916 Aug 18 08:46 server.cnf
-rw-r--r--    1 root     system         1944 Aug 18 08:46 client.cnf
-r-xr-xr-x    1 root     system         3761 Aug 18 08:46 Makefile
-r-xr-xr-x    1 root     system         1952 Aug 18 08:46 SSL_server.cnf
-r-xr-xr-x    1 root     system         1978 Aug 18 08:46 SSL_client.cnf
-r-xr-xr-x    1 root     system         1902 Aug 18 08:46 root.cnf
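
A quick sanity check of the generated files can be scripted. The following is a minimal sketch that only tests for the presence of the bundled certificates and keys seen in the listing above; the file list is taken from that listing and may differ on other levels, and check_ssl_nimsh is a hypothetical helper name.

```sh
# Hypothetical helper: report any expected certificate or key file that is
# missing or empty under the given base directory (default /ssl_nimsh).
# Returns 0 when everything is present, 1 otherwise.
check_ssl_nimsh() {
    base=${1:-/ssl_nimsh}
    rc=0
    for f in certs/server.pem certs/client.pem certs/root.pem \
             keys/serverkey.pem keys/clientkey.pem keys/rootkey.pem; do
        if [ ! -s "$base/$f" ]; then
            echo "missing or empty: $base/$f"
            rc=1
        fi
    done
    return $rc
}
```

For example: check_ssl_nimsh /ssl_nimsh && echo "NIMSH SSL files present"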

Setting up alternate master ( HANIM ) and getting certificates.
Next, set up the alternate master environment. This can be done from the command line or with smitty; this example uses the command line with the "niminit" command.
NIM-B must be made an alternate to NIM-A, and NIM-A an alternate to NIM-B, so that each can take over from the other.
On NIM-B:
# niminit -a is_alternate=yes -a name=halloweenbso -a master=kronos -a connect=nimsh -a pif_name=en0 -a cable_type=N/A
0513-071 The nimesis Subsystem has been added.
0513-071 The nimd Subsystem has been added.
0513-059 The nimesis Subsystem has been started. Subsystem PID is 17498622.
nimsh:2:wait:/usr/bin/startsrc -e "LIBPATH=/usr/lib" -g nimclient >/dev/console 2>&1
0513-059 The nimsh Subsystem has been started. Subsystem PID is 17367336.
On NIM-A: 
# niminit -a is_alternate=yes -a name=kronos -a master=halloweenbso -a connect=nimsh -a pif_name=en0 -a cable_type=N/A
nimsh:2:wait:/usr/bin/startsrc -e "LIBPATH=/usr/lib" -g nimclient >/dev/console 2>&1
0513-044 The nimsh Subsystem was requested to stop.
0513-059 The nimsh Subsystem has been started. Subsystem PID is 12386782.
The attributes explained:
name=       : The hostname of the NIM master you are working on.
master=     : The hostname of the other NIM master.
connect=    : Connection protocol; only nimsh is possible.
pif_name=   : Network interface, e.g. en0.
cable_type= : A legacy attribute, but required. Types can be "tp" (twisted pair), "bnc" (coaxial) or N/A, which is what most modern environments use, since most systems use optical cables or balanced pair cables.
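
Because the same command is run on both masters with name= and master= swapped, it is easy to mix the two up. The following minimal sketch prints the command line for review rather than running it; alt_niminit_cmd is a hypothetical helper name, and the attribute values are the ones used in this example.

```sh
# Hypothetical helper: print (not run) the niminit command for one side of
# the alternate master pair.
#   $1 = NIM name of the master you are working on
#   $2 = NIM name of the other master
#   $3 = network interface (defaults to en0, as in this example)
alt_niminit_cmd() {
    printf 'niminit -a is_alternate=yes -a name=%s -a master=%s -a connect=nimsh -a pif_name=%s -a cable_type=N/A\n' \
        "$1" "$2" "${3:-en0}"
}
```

For example, "alt_niminit_cmd halloweenbso kronos" prints the command shown above for NIM-B, and "alt_niminit_cmd kronos halloweenbso" prints the one for NIM-A.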
Then generate a certificate on the alternate NIM:
# nimconfig -c 

 
At this point, each NIM master sees the other as its alternate:
[kronos.aus.stglabs.ibm.com]/
# lsnim -t alternate_master
halloweenbso     machines       alternate_master


[halloweenbso.aus.stglabs.ibm.com]/ 
 # lsnim -t alternate_master
kronos     machines       alternate_master
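
This check can also be scripted on each master. A minimal sketch, assuming the three-column output shown above; has_alternate is a hypothetical helper name.

```sh
# Hypothetical helper: read "lsnim -t alternate_master" output on stdin and
# succeed only if the named peer is listed with type alternate_master.
has_alternate() {
    awk -v peer="$1" '
        $1 == peer && $3 == "alternate_master" { found = 1 }
        END { exit !found }
    '
}
```

For example, on kronos: lsnim -t alternate_master | has_alternate halloweenbso && echo "peer defined"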
Next, we need to pull each master's SSL certificate:
[kronos.aus.stglabs.ibm.com]/ssl_nimsh 
 # nimclient -o get_cert -a master_name=halloweenbso
Received 4315 Bytes in 0.0 Seconds


[halloweenbso.aus.stglabs.ibm.com]/ 
 # nimclient -o get_cert -a master_name=kronos
Received 4303 Bytes in 0.0 Seconds
This places the other NIM master's certificate in the /ssl_nimsh/certs directory:
[kronos.aus.stglabs.ibm.com]/ssl_nimsh/certs 
 # ls -l hallo*
-rw-r--r--    1 root     system         4315 Aug 18 09:22 halloweenbso.0


[halloweenbso.aus.stglabs.ibm.com]/ssl_nimsh/certs 
 # ls -l kronos*
-rw-r--r--    1 root     system         4303 Aug 18 09:24 kronos.0

Setting up the client LPARs and pulling the SSL certificates.
Note that clients defined before the alternate master setup need to be redefined (  # rm /etc/niminfo  ;  smitty niminit ) in order to pick up the new dual NIM configuration.
A client that was defined before the alternate master setup has an /etc/niminfo file that looks something like this:

# cat /etc/niminfo
#------------------ Network Install Manager ---------------
# warning - this file contains NIM configuration information
#       and should only be updated by NIM
export NIM_NAME=robotron
export NIM_HOSTNAME=robotron.aus.stglabs.ibm.com
export NIM_CONFIGURATION=standalone
export NIM_MASTER_HOSTNAME=kronos.aus.stglabs.ibm.com
export NIM_MASTER_PORT=1058
export NIM_REGISTRATION_PORT=1059
export NIM_SHELL="nimsh"
export NIM_MASTERID=00FB43B94C00
export NIM_FIPS_MODE=0
export NIM_BOS_IMAGE=/SPOT/usr/sys/inst.images/installp/ppc/bos
export NIM_BOS_FORMAT=rte
export NIM_HOSTS=" 127.0.0.1:loopback:localhost  10.99.1.79:robotron.aus.stglabs.ibm.com  10.99.1.89:kronos.aus.stglabs.ibm.com "
export NIM_MOUNTS=""
export NFS_RESERVED_PORT=no
export ROUTES=" default:0:10.99.0.1 "

There is no indication of a second NIM master. Once both NIM masters are configured and the client is redefined by removing the niminfo file and running "niminit" again, the niminfo file looks like this:
# rm /etc/niminfo

# niminit -a name=robotron -a master=kronos -a connect=nimsh -a pif_name=en0 -a cable_type=N/A
nimsh:2:wait:/usr/bin/startsrc -e "LIBPATH=/usr/lib" -g nimclient >/dev/console 2>&1
0513-044 The nimsh Subsystem was requested to stop.
0513-059 The nimsh Subsystem has been started. Subsystem PID is 4587702.


# cat /etc/niminfo
#------------------ Network Install Manager ---------------
# warning - this file contains NIM configuration information
#       and should only be updated by NIM
export NIM_NAME=robotron
export NIM_HOSTNAME=robotron.aus.stglabs.ibm.com
export NIM_CONFIGURATION=standalone
export NIM_MASTER_HOSTNAME=kronos.aus.stglabs.ibm.com
export NIM_MASTER_PORT=1058
export NIM_REGISTRATION_PORT=1059
export NIM_SHELL="nimsh"
export NIM_MASTERID=00FB43B94C00
export NIM_FIPS_MODE=0
export NIM_MASTER_HOSTNAME_LIST="kronos.aus.stglabs.ibm.com halloweenbso.aus.stglabs.ibm.com"
export NIMSH_AUTH="kronos.aus.stglabs.ibm.com|00FB43B94C00 halloweenbso.aus.stglabs.ibm.com|00F634864C00"
export NIM_BOS_IMAGE=/SPOT/usr/sys/inst.images/installp/ppc/bos
export NIM_BOS_FORMAT=rte
export NIM_HOSTS=" 127.0.0.1:loopback:localhost  10.99.1.79:robotron.aus.stglabs.ibm.com  10.99.1.89:kronos.aus.stglabs.ibm.com "
export NIM_MOUNTS=""
export ROUTES=" default:0:10.99.0.1 "
The most important attribute is NIMSH_AUTH, which contains the identity of both NIM masters:
export NIMSH_AUTH="kronos.aus.stglabs.ibm.com|00FB43B94C00 halloweenbso.aus.stglabs.ibm.com|00F634864C00"
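
To confirm that a client knows about both masters, the NIMSH_AUTH value can be split into its hostname and ID parts. A minimal sketch, assuming the "host|id host|id" format shown above; nimsh_auth_pairs is a hypothetical helper name.

```sh
# Hypothetical helper: read niminfo content on stdin and print one
# "hostname id" line per master listed in NIMSH_AUTH.
nimsh_auth_pairs() {
    sed -n 's/^export NIMSH_AUTH="\(.*\)"$/\1/p' |   # keep only the quoted value
        tr ' ' '\n' |                                # one host|id entry per line
        awk -F'[|]' 'NF == 2 { print $1, $2 }'       # split on the pipe character
}
```

For example: nimsh_auth_pairs < /etc/niminfo
A client that has picked up the dual NIM configuration prints two lines, one per master.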
We define the client to the primary master NIM-A with the niminit command, then pull the primary master's SSL certificate and the alternate master's certificate:
(0)[robotron] /
# niminit -a name=robotron -a master=kronos -a connect=nimsh -a pif_name=en0 -a cable_type=N/A
nimsh:2:wait:/usr/bin/startsrc -e "LIBPATH=/usr/lib" -g nimclient >/dev/console 2>&1
0513-044 The nimsh Subsystem was requested to stop.
0513-059 The nimsh Subsystem has been started. Subsystem PID is 4587692.

(0)[robotron] /
# nimclient -c
Received 4303 bytes in 0.0 seconds
0513-044 The nimsh Subsystem was requested to stop.
0513-077 Subsystem has been changed.
0513-059 The nimsh Subsystem has been started. Subsystem PID is 4587698.

(0)[robotron] /ssl_nimsh/certs
# ls -l 
total 16
-rw-r--r--    1 root     system         4303 Aug 18 09:54 kronos.aus.stglabs.ibm.com.0

(0)[robotron] /ssl_nimsh/certs
# nimclient -o get_cert -a master_name=halloweenbso.aus.stglabs.ibm.com
Received 4311 bytes in 0.0 seconds
(0)[robotron] /ssl_nimsh/certs
# ls -ltr
total 32
-rw-r--r--    1 root     system         4303 Aug 18 09:54 kronos.aus.stglabs.ibm.com.0
-rw-r--r--    1 root     system         4311 Aug 18 09:55 halloweenbso.aus.stglabs.ibm.com.0
We used "nimclient -c" to pull the primary NIM master's certificate and "nimclient -o get_cert -a master_name=" to pull the alternate master's certificate, at which point there are two certificates in the /ssl_nimsh/certs directory.
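
On a client, the certificate check can be tied to the master list in /etc/niminfo. The following is a minimal sketch, assuming the certificates are named "<master hostname>.0" as in the client listing above; missing_master_certs is a hypothetical helper name.

```sh
# Hypothetical helper: for every host in NIM_MASTER_HOSTNAME_LIST, report
# whether "<host>.0" is missing from the certificate directory.
#   $1 = path to the niminfo file, $2 = certificate directory
# Returns 0 when every listed master has a certificate, 1 otherwise.
missing_master_certs() {
    niminfo=$1
    certdir=$2
    # Extract the space-separated host list from the quoted value
    hosts=$(sed -n 's/^export NIM_MASTER_HOSTNAME_LIST="\(.*\)"$/\1/p' "$niminfo")
    rc=0
    for h in $hosts; do
        if [ ! -s "$certdir/$h.0" ]; then
            echo "no certificate for $h"
            rc=1
        fi
    done
    return $rc
}
```

For example: missing_master_certs /etc/niminfo /ssl_nimsh/certs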

Trying SYNC
Now that everything is configured, we can sync the NIM database and then test a takeover.
The sync is executed from the primary NIM master:
[kronos.aus.stglabs.ibm.com]/ssl_nimsh/certs 
 # nim -Fo sync hallowebso
a ./etc/objrepos/nim_attr 8 blocks
a ./etc/objrepos/nim_attr.vc 8 blocks
a ./etc/objrepos/nim_object 8 blocks
a ./etc/objrepos/nim_object.vc 8 blocks
a ./etc/NIM.level 1 blocks
a ./etc/niminfo 1 blocks
a ./etc/NIM.primary.cpuid 1 blocks
The original NIM database was backed up to the
        following location prior to this operation:
        "/export/nim/backups/halloweenbso.aus.stglabs.ibm.com.09560908182020.backup"

0513-044 The nimesis Subsystem was requested to stop.
0513-004 The Subsystem or Group, nimd, is currently inoperative.
0513-083 Subsystem has been Deleted.
0513-083 Subsystem has been Deleted.
0518-307 odmdelete: 6 objects deleted.
0518-307 odmdelete: 54 objects deleted.
Restoring the NIM database from /tmp/_nim_dir_17564140/mnt0
Level check is successful.

x ./etc/objrepos/nim_attr, 4096 bytes, 8 tape blocks
x ./etc/objrepos/nim_attr.vc, 4096 bytes, 8 tape blocks
x ./etc/objrepos/nim_object, 4096 bytes, 8 tape blocks
x ./etc/objrepos/nim_object.vc, 4096 bytes, 8 tape blocks
x ./etc/NIM.level, 8 bytes, 1 tape blocks
x ./etc/niminfo, 335 bytes, 1 tape blocks
x ./etc/NIM.primary.cpuid, 13 bytes, 1 tape blocks
0513-071 The nimesis Subsystem has been added.
0513-071 The nimd Subsystem has been added.
Finished restoring the NIM database
Updating master definition in database from hallowebso definition
  Updated master attribute platform to chrp
  Updated master attribute netboot_kernel to 64
  Updated master attribute if1 to master_net halloweenbso.aus.stglabs.ibm.com 00215EA94105 ent0
  Updated master attribute cable_type1 to N/A
Finished updating master definition
0513-059 The nimesis Subsystem has been started. Subsystem PID is 17498390.
Resetting machines
  Reset master
  Reset hallowebso
  Reset robotron
Finished resetting machines
Removing NIM client hallowebso
Finished removing hallowebso
Resetting NIM resources
Finished resetting NIM resources
Checking NIM resources
  Keeping certificate
Finished checking NIM resources
Checking NIM SPOTs
Finished checking SPOTs
nim_master_recover Complete

/usr/sbin/niminit
Testing if we can access the Master @ kronos.aus.stglabs.ibm.com
Establishing contact with NIM master 
Removing NIMSH, NIMSH_AUTH, NIM_ALTERNATE_MASTER
moving /etc/niminfo.rewrite to /etc/niminfo
Writing request to master 
Building niminfo file 
Granting master nimsh permission 
nimsh:2:wait:/usr/bin/startsrc -e "LIBPATH=/usr/lib" -g nimclient >/dev/console 2>&1
Adding routes from niminfo file 
Adding entry to inittab 
After the sync, all client definitions should be present on both NIM masters.
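
One way to verify this is to save the output of "lsnim -t standalone" on each master and compare the object names. A minimal sketch, assuming the two outputs have been copied to one system as plain files; clients_missing_on_alternate is a hypothetical helper name.

```sh
# Hypothetical helper: print the names of clients that appear in the
# primary's saved "lsnim -t standalone" output but not in the alternate's.
#   $1 = file with the primary's output, $2 = file with the alternate's
clients_missing_on_alternate() {
    awk '
        FILENAME == ARGV[1] { seen[$1] = 1; next }   # names known to the alternate
        !($1 in seen)       { print $1 }             # primary-only names
    ' "$2" "$1"
}
```

No output means every standalone client defined on the primary is also defined on the alternate.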

Trying TAKEOVER: 
Next, we can perform a takeover. This is executed from the alternate NIM master (the one taking over):
[halloweenbso.aus.stglabs.ibm.com]/ssl_nimsh/certs 
 # nim -o takeover kronos 
+-----------------------------------------------------------------------------+
                      Performing "reset" Operation
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
                      "reset" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 robotron                RESET

+-----------------------------------------------------------------------------+
                      Initiating "takeover" Operation
+-----------------------------------------------------------------------------+
 Initiating the takeover operation on machine 1 of 1: robotron ...

+-----------------------------------------------------------------------------+
                      "takeover" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 robotron                SUCCESS

 Note: Use the lsnim command to monitor progress of "SUCCESS"
 targets by viewing their NIM database definition.

Updating the NIM database on the "alternate_master" "kronos"...
+-----------------------------------------------------------------------------+
                      "update" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 robotron                SUCCESS

And the other way around: 
[kronos.aus.stglabs.ibm.com]/ssl_nimsh/certs 
 # nim -o takeover halloweenbso  
+-----------------------------------------------------------------------------+
                      Performing "reset" Operation
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
                      "reset" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 robotron                RESET

+-----------------------------------------------------------------------------+
                      Initiating "takeover" Operation
+-----------------------------------------------------------------------------+
 Initiating the takeover operation on machine 1 of 1: robotron ...

+-----------------------------------------------------------------------------+
                      "takeover" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 robotron                SUCCESS

 Note: Use the lsnim command to monitor progress of "SUCCESS"
 targets by viewing their NIM database definition.

Updating the NIM database on the "alternate_master" "halloweenbso"...
+-----------------------------------------------------------------------------+
                      "update" Operation Summary
+-----------------------------------------------------------------------------+
 Target                  Result
 ------                  ------
 robotron                SUCCESS

Now that takeover has been tested from both NIM masters, the setup can be considered successful.
*Note that takeover only works on "standalone" clients, meaning your VIO servers, WPARs, HMCs, etc. will not be taken over by the alternate NIM master; those need to be redefined and switched manually.
By Nayden Stoyanov

Document Location

Worldwide


Document Information

Modified date:
27 March 2023

UID

ibm16261423