IBM Support

Reduce filesystem utilization on /opt

Troubleshooting


Problem

There are several reasons /opt can fill up. This article covers the three main culprits and how to resolve them.

Cause


The /opt partition is relatively small (8 GB) and cannot be expanded.

1. Firmware backups are stored in this folder. This behavior changed in FDT 4.3, which stores backups under /nzscratch instead.
2. Multiple hardware support tools bundles are installed.
3. If the PureData host is unable to contact the ISOdx server, snapshots start to collect on the system. By default, ISOdx is currently configured to keep 30 snapshots locally, and it automatically deletes the oldest snapshot when more arrive. ISOdx compares the snapshot just taken with the previous one and deletes any information that is the same. This keeps ISOdx snapshots small if nothing has changed on the server.

The problem is that if the customer changes their server configuration from week to week (for example, by doing NZ upgrades or Red Hat upgrades), each snapshot is going to be bigger than normal. This makes the /opt directory fill up.

Diagnosing The Problem


Confirm that /opt filesystem space is a concern.

[root@NZ80641-H1 Linux]# df -h /opt
Filesystem Size Used Avail Use% Mounted on
/dev/sda8 7.8G 5.6G 1.8G 76% /opt
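
The check above can also be scripted. The following is a minimal sketch, not part of the documented procedure: the function names fs_use_pct and fs_over_threshold are hypothetical, and it assumes the mount point contains no spaces.

```shell
# Print the Use% figure for the filesystem holding a path, as a bare integer.
# -P forces POSIX output so each filesystem stays on a single line.
fs_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage is at or above the given percentage.
fs_over_threshold() {
    [ "$(fs_use_pct "$1")" -ge "$2" ]
}
```

For example, "fs_over_threshold /opt 75 && echo /opt needs attention" prints a warning once /opt reaches 75% used. The 75% threshold is only an illustration; pick a value that suits the site.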

To see which directories in /opt are the largest, run:

# du -sh /opt/*
0 /opt/accurev
143M /opt/accurev-4.9.1
14M /opt/apache-tomcat-7.0.35
7.5M /opt/apache-tomcat-7.0.35.tar.gz
1.4M /opt/arcconf
2.0M /opt/downloads
838M /opt/ibm
108M /opt/IBM
62M /opt/IBMmpcli
16K /opt/lost+found
5.5M /opt/lsi
13M /opt/MegaRAID
28K /opt/Netezza
1.4G /opt/nz
165M /opt/nz-hwsupport
178M /opt/nz-hwsupport-16.tar
180M /opt/nz-hwsupport-17.tar
167M /opt/nz-hwsupportold
132M /opt/nz-hwsupportperm.tar.gz
127M /opt/nz-hwsupport-tf-V8.1.1-20130408.tar.gz
163M /opt/nz-hwsupport-tf-V8.2-20130822
129M /opt/nz-hwsupport-tf-V8.2-20130822.tar.gz
1.4G /opt/nz-snapshot
170M /opt/oldnz-hwsupport.tar
16K /opt/output.log
7.1M /opt/RH-5.9-SVN.tar.gz
4.0K /opt/sc
168M /opt/source
9.7M /opt/tivoli
44K /opt/wbemsmt_setup_v1.0.sh
44K /opt/wbemsmt_setup_v1.0.sh.1

As shown in this example, the firmware folder /opt/nz is 1.4 GB.

The hwsupport folders and bundles, /opt/nz-hwsupport*, total approximately 1.2 GB.

The ISOdx snapshots, /opt/nz-snapshot, total 1.4 GB.
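
To make the largest entries stand out, the du output can be sorted and totaled. This is a sketch only; largest_entries and total_kb are hypothetical helper names, and sizes are reported in KB so that they sort numerically.

```shell
# List the n largest entries directly under a directory (sizes in KB, largest first).
largest_entries() {
    dir=$1
    n=${2:-10}
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Print the combined size in KB of all paths given as arguments.
# -c makes du append a grand total line, which awk picks up as the last line.
total_kb() {
    du -sck "$@" | awk 'END { print $1 }'
}
```

For example, "largest_entries /opt 5" lists the five biggest entries, and "total_kb /opt/nz-hwsupport*" gives the combined size of the hwsupport folders and bundles.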

You can also check how big each snapshot is by going to the snapshot directory for the server and running du:

1. cd /opt/nz-snapshot/isodx/custom01/Linux/out/[Name Of Server]
Example:
[root@NZ80641-H1 /]# cd /opt/nz-snapshot/isodx/custom01/Linux/out/NZ80641-H1

2. [root@NZ80641-H1 NZ80641-H1]# du -h
80M ./1387099473
13M ./1385233218
3.7M ./1383470672
84M ./1388913872 <---- SNAPSHOTS
4.7M ./1385889872
39M ./1386494672
72M ./1387704272
4.5M ./1384075472
75M ./1388309072
2.2M ./1382862272
33M ./1389638292
5.0M ./1385670393
87M ./1389518672
4.3M ./1385291612
5.4M ./1384680272
567M .

As shown above, the average snapshot size on this server is around 40M, because there have been minimal changes to the system. Because the size of a snapshot can vary dramatically, you may need to lower the number of snapshots kept from the default of 30 to 15 or fewer. Keeping 15 snapshots should be suitable for an average snapshot size of 200M to 300M. Adjust accordingly.
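
The average snapshot size and the space a given keep count would need can be estimated with a short script. This is a rough sketch under stated assumptions: avg_snapshot_kb and estimated_footprint_kb are hypothetical names, and the estimate simply multiplies the keep count by the current average, so it ignores how much future snapshots may grow.

```shell
# Average size in KB of the snapshot directories directly under a host's out directory.
avg_snapshot_kb() {
    du -sk "$1"/*/ 2>/dev/null |
        awk '{ total += $1; n++ } END { if (n) printf "%d\n", total / n; else print 0 }'
}

# Rough space in KB needed to hold "keep" snapshots of the given average size.
estimated_footprint_kb() {
    echo $(( $1 * $2 ))
}
```

For example, with an average of about 40M (40960 KB), "estimated_footprint_kb 15 40960" gives 614400 KB, or roughly 600M for 15 snapshots. Compare that figure with the free space reported by df before settling on a keep count.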

If both the nz-snapshot folder and the out folder are found to be large, the ISOdx configuration needs to be reset. This requires no downtime and takes about 10 minutes to complete.

NOTE:
Each snapshot is saved into a directory named with an epoch timestamp. Within these directories you will find the log and err files pertaining to that snapshot. Do not delete individual log files from these directories. Doing so breaks the snapshot and renders it useless for later comparison.
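
Because the directory names are epoch timestamps, they can be converted to readable dates to see when each snapshot was taken. A minimal sketch, assuming GNU date (as shipped with Red Hat); epoch_to_date is a hypothetical helper name.

```shell
# Convert an epoch timestamp (seconds since 1970-01-01 UTC) to a readable UTC date.
epoch_to_date() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}
```

For example, "epoch_to_date 1387099473" shows when the snapshot directory 1387099473 from the listing above was created.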

Resolving The Problem

The folder /opt/nz/fdt_backup is safe to move to the /nzscratch folder, assuming /nzscratch also has enough space.
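
Before moving the folder, it is worth confirming that the destination has room. The following sketch compares the size of a source folder with the free space on the destination filesystem; fits_in is a hypothetical helper name, not an existing tool.

```shell
# Succeed when the filesystem holding the destination has more free KB
# than the source folder currently uses.
fits_in() {
    need=$(du -sk "$1" | awk '{ print $1 }')
    free=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
    [ "$free" -gt "$need" ]
}
```

For example, "fits_in /opt/nz/fdt_backup /nzscratch && mv /opt/nz/fdt_backup /nzscratch/" performs the move only when the space check passes.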

All but the latest nz-hwsupport folder are safe to remove. If any issues arise, the bundle can be re-installed without issue.
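
To review removal candidates before deleting anything, the entries can be listed with the newest one left out. This is a dry-run sketch only. It assumes that "latest" means most recently modified, which may not match the highest version number, so check the list by hand. The name all_but_newest is hypothetical.

```shell
# Print every path given except the most recently modified one.
# Nothing is deleted; the output is a candidate list for manual review.
all_but_newest() {
    ls -1dt "$@" | tail -n +2
}
```

For example, "all_but_newest /opt/nz-hwsupport*" prints the older folders and archives so they can be inspected and then removed individually.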

To address ISOdx space usage, reconfigure ISOdx to hold fewer snapshots so that the system deletes the older ones. The reconfiguration takes about 15 minutes and requires no downtime.

Running the new script can take up to 2 hours, depending on how many changes happened on the customer servers.



1. Open the snapshot script located in cron.weekly:
"vi /etc/cron.weekly/cron-nz-snapshot"

2. Modify line 35 so that --keep=30 reads --keep=15:
/usr/sbin/isodx -o $org -i isodx.netezza.com -p isodx --snapshot --keep=15 -T /opt/nz-snapshot

3. Copy the cron snapshot script into the /tmp directory:
"cp /etc/cron.weekly/cron-nz-snapshot /tmp/cron-nz-snapshot-clean"

4. Open the new script:
"vi /tmp/cron-nz-snapshot-clean"

5. Go down to around line 18, where it says "sleep $[ ( $RANDOM % 360 ) + 1 ]m"

6. Put a "#" in front of that line:
"#sleep $[ ( $RANDOM % 360 ) + 1 ]m"

7. Save and exit the script.

8. Run "chmod +x /tmp/cron-nz-snapshot-clean"

9. Run "/tmp/cron-nz-snapshot-clean"

10. Delete the temporary script:
"rm /tmp/cron-nz-snapshot-clean"

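The manual edits in steps 2 through 8 can also be made with sed instead of vi. The following is a sketch under stated assumptions, not the supported procedure: set_keep and make_immediate_copy are hypothetical helper names, the script is assumed to contain a single --keep=N option, and the only line beginning with "sleep " is assumed to be the random sleep. Review the resulting files before running them.

```shell
# Step 2: set the --keep value in the snapshot script.
# The edit goes through a temporary file outside cron.weekly so that no
# stray backup copy is left where cron could execute it.
set_keep() {
    script=$1
    keep=$2
    tmp=$(mktemp) || return 1
    sed "s/--keep=[0-9][0-9]*/--keep=$keep/" "$script" > "$tmp" &&
        cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Steps 3 to 8: write an executable copy with the sleep line commented out.
make_immediate_copy() {
    sed 's/^\([[:space:]]*\)sleep /\1#sleep /' "$1" > "$2" && chmod +x "$2"
}
```

For example, "set_keep /etc/cron.weekly/cron-nz-snapshot 15" followed by "make_immediate_copy /etc/cron.weekly/cron-nz-snapshot /tmp/cron-nz-snapshot-clean" leaves only steps 9 and 10 (running and then deleting the temporary script) to do by hand.
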
After the process has finished, the /opt directory should be cleared of the snapshots beyond the 15 kept, giving the customer more room within /opt. To check that this worked, run:

[root@NZ80641-H1 Linux]# df -h /opt

[root@NZ80641-H1 Linux]# du -sh /opt/nz-snapshot

[root@NZ80641-H1 Linux]# du -sh /opt/nz-snapshot/isodx/custom01/Linux/out/

If /opt usage is low again and /opt/nz-snapshot is small, the problem has been fixed.
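
The number of snapshot directories left for a host can also be counted and compared with the keep value. A minimal sketch; count_snapshots and within_keep are hypothetical helper names, and every directory directly under the host's out directory is assumed to be a snapshot.

```shell
# Count the snapshot directories directly under a host's out directory.
count_snapshots() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

# Succeed when the number of snapshots does not exceed the keep value.
within_keep() {
    [ "$(count_snapshots "$1")" -le "$2" ]
}
```

For example, "within_keep /opt/nz-snapshot/isodx/custom01/Linux/out/NZ80641-H1 15 && echo OK" confirms that no more than 15 snapshots remain for the example host.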

NOTE:
A copy of the script is created and modified because cron-nz-snapshot contains a sleep command. When you run the script, it sleeps for a random amount of time before running any commands, which could be up to 360 minutes (6 hours). To fix the problem right away, create an exact copy of the script with the sleep command commented out. The new script runs without waiting and fixes the problem right away.

The sleep command is there so that customers do not all upload a snapshot at the same time. With the random sleep, the ISOdx server does not get overloaded with snapshots being sent simultaneously.
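
The range of the random delay follows from the arithmetic in the sleep line. The sketch below reproduces the same calculation using the $(( )) form, which is equivalent to the older $[ ] syntax in the script; sleep_minutes is a hypothetical helper that takes the random number as an argument so the result can be checked for specific inputs.

```shell
# Delay in minutes for a given random number: the remainder after
# dividing by 360 is 0 to 359, so adding 1 gives 1 to 360.
sleep_minutes() {
    echo $(( ($1 % 360) + 1 ))
}
```

In bash, "sleep_minutes "$RANDOM"" returns a value between 1 and 360, so the longest possible wait is 360 minutes.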


Document Information

Modified date:
17 October 2019

UID

swg21661859