
I was quite tired when I spotted these benches at the Kyoto train station. They looked stark and cold, but when I sat down they turned out to be some of the most comfortable bench seats I have come across, much more than I had hoped. Clearly less can be more!
IBM Tivoli Network Manager, when configured for failover, should always be polling and monitoring the network, whether from the primary or the backup ITNM installation. Less downtime is definitely better!
But there are times when the activeModel file in $NCHOME/var/precision needs to be removed and model brought up empty, perhaps to clear the ncim database or to clear the ncimCache of the event gateway or model. If model comes up with no topology and a failback occurs, nothing would be polled!
Take a look at this scenario for removing the model cache while eliminating downtime, with the help of virtualdomain.
When the primary ITNM domain is stopped, the failover server takes over, and polling of the network occurs on the backup.
On the primary ITNM server:
1) Move the activeModel file out of $NCHOME/var/precision.
(The file has a name like Store.Cache.kernel.activeModel.PCOM39.)
2) Copy $NCHOME/etc/precision/CtrlServices.PCOM39.cfg and keep it under a different name.
3) Edit CtrlServices.PCOM39.cfg and remove the insert statements for 'poller' and 'virtualdomain'. The example under step 5) shows how many lines each insert involves.
4) Start the primary ITNM domain again. (A rough sketch of steps 1) to 4) follows below.)
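As a sketch, steps 1) to 4) might look like this on the primary, assuming the domain is PCOM39 as in this example. The /tmp destination and the saved name CtrlServices.PCOM39.cfg.orig are just illustrations, and the itnm_start syntax may vary slightly with your ITNM version.
[root@ncPrimary ~]# cd $NCHOME/var/precision
[root@ncPrimary precision]# mv Store.Cache.kernel.activeModel.PCOM39 /tmp/              # 1) move the activeModel out of this directory
[root@ncPrimary precision]# cd $NCHOME/etc/precision
[root@ncPrimary precision]# cp CtrlServices.PCOM39.cfg CtrlServices.PCOM39.cfg.orig     # 2) keep the original under a different name
[root@ncPrimary precision]# vi CtrlServices.PCOM39.cfg                                  # 3) remove the inserts for ncp_poller(default) and ncp_virtualdomain
[root@ncPrimary precision]# cd $NCHOME/precision/bin
[root@ncPrimary bin]# ./itnm_start ncp -domain PCOM39                                   # 4) start the primary domain again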
When this domain comes up, it will discover its network, create a new activeModel file, and write the topology to ncim. With no poller running, nothing will be polled on the primary. With no virtualdomain, the backup continues to poll the network.
Once all processes are RUNNING and the topology is written to ncim, continue.
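If you want to confirm that the rediscovery has completed before continuing, one option (a sketch; the Disco service name and the disco.status table are the standard ones in ITNM 3.9, but check the documentation for your version) is to query the discovery engine:
[root@ncPrimary bin]# ./ncp_oql -username admin -service Disco -domain PCOM39
|ncPrimary:1.> select * from disco.status;
|ncPrimary:2.> go
The output reports the current discovery phase; once the final phase has finished and itnm_status shows everything RUNNING, the topology should have been written to ncim.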
5) Insert the processes that were removed in the edit in step 3), using ncp_oql against the ctrl service as shown below:
[root@ncPrimary bin]#
./ncp_oql -username admin -service ctrl -domain PCOM39
ncp_oql ( IBM Tivoli Network Manager OQL Interface )
Copyright (C) 1997 - 2010 By IBM Corporation. All Rights Reserved. See product license for details.
IBM Tivoli Network Manager Version 3.9 (Build 97) created by ncpbuild at 17:09:54 Fri Feb 8 GMT 2013
Using no authentication
|ncPrimary:1.>
insert into services.inTray
(
serviceName,
binaryName,
servicePath,
domainName,
argList,
dependsOn,
retryCount
)
values
(
"ncp_virtualdomain",
"ncp_virtualdomain",
"$PRECISION_HOME/platform/$PLATFORM/bin",
"$PRECISION_DOMAIN",
[ "-domain" , "$PRECISION_DOMAIN" , "-latency" , "100000", "-debug", "0", "-messagelevel", "warn"],
[ "ncp_poller(default)", "ncp_g_event" ],
5
);
go
insert into services.inTray
(
serviceName,
binaryName,
servicePath,
domainName,
argList,
dependsOn,
retryCount
)
values
(
"ncp_poller(default)",
"ncp_poller",
"$PRECISION_HOME/platform/$PLATFORM/bin",
"$PRECISION_DOMAIN",
[ "-domain" , "$PRECISION_DOMAIN" , "-latency" , "100000", "-debug", "0", "-messagelevel", "warn" ],
[ "nco_p_ncpmonitor", "ncp_g_event" ],
5
);
go
quit
[root@ncPrimary bin]# ./itnm_status
OMNIbus:
is not installed on this system.
Network Manager:
Domain: PCOM39
ncp_ctrl RUNNING PID=18806 PCOM39
ncp_store RUNNING PID=18957 PCOM39
ncp_class RUNNING PID=18958 PCOM39
ncp_model RUNNING PID=19097 PCOM39
ncp_disco RUNNING PID=19308 PCOM39
ncp_d_helpserv RUNNING PID=18959 PCOM39
ncp_config RUNNING PID=18960 PCOM39
nco_p_ncpmonitor RUNNING PID=18961 PCOM39
ncp_g_event RUNNING PID=19309 PCOM39
ncp_webtool RUNNING PID=18962 PCOM39
ncp_virtualdomain RUNNING PID=20669 PCOM39
ncp_poller(default) RUNNING PID=20390 PCOM39
Tivoli Integrated Portal:
Server RUNNING PID=23906
[root@ncPrimary bin]#
An itnm_status will show those new processes added to the bottom of the display, and soon they will both show RUNNING.
The poller will soon fail back, and you will see that polling has transitioned from the backup to the primary:
[root@ncPrimary bin]#
./ncp_oql -username admin -domain PCOM39 -service snmppoller -tabular
ncp_oql ( IBM Tivoli Network Manager OQL Interface )
Copyright (C) 1997 - 2010 By IBM Corporation. All Rights Reserved. See product license for details.
IBM Tivoli Network Manager Version 3.9 (Build 97) created by ncpbuild at 17:09:54 Fri Feb 8 GMT 2013
Using no authentication
|ncPrimary:1.> select * from profiling.policy;
|ncPrimary:2.> go
.
+----------+------------+----------------------+----------------------+-------------+-------------+----------------+-----------+------------+
| POLICYID | TEMPLATEID | POLICYNAME | TEMPLATENAME | TARGETCOUNT | ENTITYCOUNT | FIRSTSCOPETIME | SCOPETIME | SCOPECOUNT |
+----------+------------+----------------------+----------------------+-------------+-------------+----------------+-----------+------------+
| 891 | 1 | Default Chassis Ping | Default Chassis Ping | 103 | 103 | 161 | 0 | 1 |
+----------+------------+----------------------+----------------------+-------------+-------------+----------------+-----------+------------+
( 1 record(s) : Transaction complete )
|ncPrimary:1.>
Finally, go back and restore the original CtrlServices file saved in step 2), the one that still contains the inserts for 'poller' and 'virtualdomain'. There is no further need for the edited CtrlServices file that brought up the empty activeModel.
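Continuing the earlier sketch, where the original was saved under the hypothetical name CtrlServices.PCOM39.cfg.orig, the restore is just:
[root@ncPrimary bin]# cd $NCHOME/etc/precision
[root@ncPrimary precision]# mv CtrlServices.PCOM39.cfg.orig CtrlServices.PCOM39.cfg     # put back the original with the poller and virtualdomain inserts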
The new topology should then be transferred to the backup domain. It is possible that the way virtualdomain was inserted on the primary interfered with the connection to the backup. Keep an eye on the timestamps in $NCHOME/var/precision on the backup server to confirm that it has received the new activeModel file. I saw some log messages on the primary such as
2014-01-29T14:15:36: Warning: W-VIR-001-027: [142019440t] Not connected to Backup domain. Unable to transfer latest polls and network views
so I did a kill -9 on the <pid> of ncp_virtualdomain on the backup server, and the topology transferred immediately.
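For reference, a sketch of that check and the kill on the backup server (ncBackup is a placeholder hostname; ncp_ctrl should respawn the killed process, which then reconnects to the primary):
[root@ncBackup bin]# ls -l $NCHOME/var/precision/Store.Cache.kernel.activeModel.*     # watch for a fresh timestamp on the new activeModel
[root@ncBackup bin]# ./itnm_status | grep ncp_virtualdomain                           # note the PID
[root@ncBackup bin]# kill -9 <pid>                                                    # ncp_ctrl restarts ncp_virtualdomain and the topology transfers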
Tags: virtualdomain, activemodel, failover, itnm, kyoto