Planning for PowerVC monitoring
As you plan your IBM® Power® Virtualization Center environment monitoring requirements, make sure that you adhere to the recommended minimum hardware resources to achieve the best possible performance for a default installation. However, there are additional aspects you might want to consider, such as data retention and backup space requirements.
Data retention
By default, the PowerVC monitoring components are configured to hold a maximum of 10 GB of data per host. On a multi-node installation, that amount is further reduced by the replication factor for OpenSearch data. When there is more than one node, the replication factor defaults to 2. That provides an (n-1) resilience factor, which means the cluster still has all of its data if one of the nodes goes down.
This amount of disk space might not be adequate to store the desired time window of log entries. The number of log entries is not static; it depends on the features you are using in PowerVC and on the amount of resources it manages. On production systems, it is not unusual for log information to grow at a rate of multiple GBs per day or more.
Consequently, especially in production environments, you might want to adjust these default retention values and provide more storage space for OpenSearch.
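As a rough illustration of why the default might be too small, the following sketch estimates how many whole days of logs fit under a given space limit. It assumes a constant daily growth rate, which real systems rarely have, and the function name is hypothetical rather than part of PowerVC.

```python
def estimated_retention_days(limit_gb, daily_growth_gb):
    """Estimate how many whole days of log data fit under a space limit.

    Assumes a constant growth rate per day (a simplification).
    """
    if daily_growth_gb <= 0:
        raise ValueError("daily_growth_gb must be positive")
    return int(limit_gb // daily_growth_gb)


# With the default 10 GB limit and logs growing at 3 GB per day,
# only about three days of data can be kept.
print(estimated_retention_days(10, 3))  # 3
```

Running the same estimate with your own measured growth rate gives a first idea of how much space a desired time window needs.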
The most important retention settings are located in the /opt/ibm/powervc-opsmgr/ansible/monitoring/vars/curator.yml file. You are free to change these values, but be aware that they are only applied after a --config or a --reset CLI command.
- curator_filter_type
- This is the type of criteria rule used to manage data pruning. Either the age or the space criterion is supported. If the age criterion is used, then curator_prune_days must also be set. If the space criterion is used, then curator_disk_space must also be set. The default value for this setting is space.
- curator_prune_days
- Specifies the number of days of data that must be retained. Any data older than this is pruned (removed) from the database. It defaults to 10 days. Take special care when using the age criterion, since the amount of space that log data uses on any given day is not necessarily the same or consistent over time. In this case, always allow plenty of disk space on your /usr file system for the database to store data.
- curator_disk_space
- Specifies the maximum storage space in GB that the database can use to store daily log data. The default value is 10 GB. Data is pruned at the granularity of a whole day. That means that if your environment produces 5 GB of data on the first day, 3 GB on the second day, and 3 GB on the third day, the first day of log data (5 GB, the oldest data first) is pruned because the sum of all days (11 GB) exceeds your limit of 10 GB. Data consumption is checked on an hourly basis, so it might take up to an hour until the system recognizes and prunes the excess data. For this reason, it is always good to allocate at least an extra hour's worth of data space to your /usr file system.
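The whole-day pruning behavior described for the space criterion can be modeled with a short sketch. This is only a model of the behavior described above, not the actual pruning implementation, and the function name is hypothetical.

```python
def prune_by_space(daily_gb, limit_gb):
    """Return the per-day sizes that remain after space-based pruning.

    daily_gb is ordered oldest first. Whole days are dropped, oldest
    first, until the total no longer exceeds limit_gb.
    """
    kept = list(daily_gb)
    while kept and sum(kept) > limit_gb:
        kept.pop(0)  # remove the oldest day's worth of data
    return kept


# The example from the text: 5 GB, 3 GB, and 3 GB against a 10 GB limit.
print(prune_by_space([5, 3, 3], 10))  # [3, 3]
```

Note that pruning the 5 GB day leaves only 6 GB in use, so the space actually retained can be well below the configured limit.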
These settings must be set properly before installing PowerVC (if the monitoring variable in the inventory is set to True), or later, before installing the monitoring components by using the powervc-opsmgr monitoring --install CLI option (if the monitoring variable was not set during the initial PowerVC installation). As previously stated, you can also use other powervc-opsmgr monitoring CLI options to update the cluster after an installation: --config or --reset. Beware that a reset flushes all previously collected data, which is then lost.
Also, any changes to the above settings must be accompanied by available storage in the /usr file system.
Backup storage requirements
Backup storage can be either on a file system or on a separate disk. For multi-node installations, only the primary / bootstrap host set in the inventory holds the backup data (that storage is mounted via NFS on the other controller nodes in the cluster). The main difference between single-node and multi-node installations is that, in the multi-node case, the amount of storage you must plan for backup is equal to the total amount of data stored by the database on all hosts, not only on the primary / bootstrap host.
In the single-node scenario, this is easy to calculate, since the amount of storage you need is equal to the maximum storage allowed by the data retention criteria.
However, in the multi-node scenario you must multiply that amount by the number of monitoring controller nodes you have in your cluster.
For example, if you have a 3-node PowerVC controller cluster, your data retention criterion is space, and curator_disk_space is set to 100 GB, then the total amount of space you must allocate and reserve for a backup on the primary / bootstrap host is 300 GB (three times 100 GB).
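The sizing rule above is simple arithmetic, sketched here for the space criterion. The function name and parameters are hypothetical and only illustrate the multiplication described in the text.

```python
def backup_storage_gb(retention_limit_gb, controller_nodes=1):
    """Backup space to reserve on the primary / bootstrap host.

    retention_limit_gb is the per-host retention limit
    (curator_disk_space when the space criterion is used).
    """
    if controller_nodes < 1:
        raise ValueError("controller_nodes must be at least 1")
    return retention_limit_gb * controller_nodes


print(backup_storage_gb(100, 3))  # 300, the 3-node example above
print(backup_storage_gb(100))     # 100, the single-node case
```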
The backup settings are located in the /opt/ibm/powervc-opsmgr/ansible/monitoring/vars/elastic.yml file. Again, you are free to change these settings after an installation, but they only take effect after a --config or a --reset CLI command.
- monitor_bak_id
- This is an identifier for your backup. A directory with the same name as the value of this variable is created under the location of the monitor_bak_path variable (see below). By default, its value is set to latest. So, for example, if none of the default values were altered prior to installation, the backed-up files would be located under /backup/latest in the file system.
- monitor_bak_path
- This is the location in the file system where the backup is stored. The default is to store backups under the /backup directory in the root file system.
- monitor_bak_disk
- If this is set to yes, then a new storage device is allocated to host the backup data. The device must already be visible to the operating system before PowerVC or monitoring installation. In addition, if this is set to yes, then monitor_bak_fdev and monitor_bak_fsys must also be set. The default setting is no, which means the backup reuses the disk where the root file system (/) is allocated.
- monitor_bak_fdev
- This is the device corresponding to the additional disk that you want to use to hold the backup storage. By default, it points to the second SATA drive as recognized by the operating system (sdb) and its first partition (sdb1). You can change this setting to reflect the specific drive and partition you want to use.
- monitor_bak_fsys
- This is the type of file system you want to create on the disk or partition that is specified by monitor_bak_fdev. The PowerVC monitoring installation formats the partition by using the file system type that you chose. This value must be a valid type of file system that your operating system supports. By default, this is set to use the xfs file system type.
- monitor_bak_cidr
- This is the CIDR value of the network where your cluster is installed. It is used for mounting the NFS file system across all nodes in the cluster. By default, this is set to 22. Ensure that it matches your cluster's network CIDR value before installing PowerVC monitoring; otherwise, the backup and restore CLI commands will not work. It can also be altered after installation and then applied by using the CLI's --config or --reset options.
- monitor_bak_zips
- Controls whether the backup is compressed after it completes. By default, this is set to yes, and a zip file is created under the location set by the monitor_bak_path variable.
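The dependency between monitor_bak_disk and the two device settings can be checked before applying a configuration. The following is a minimal sketch that works on a plain dictionary of settings. The function name is hypothetical, and how the values are loaded from the YAML file is left out.

```python
def check_backup_settings(settings):
    """Return a list of problems found in the backup settings.

    Only checks the rule described above: when monitor_bak_disk is yes,
    monitor_bak_fdev and monitor_bak_fsys must also be set.
    """
    problems = []
    value = settings.get("monitor_bak_disk", "no")
    # A YAML loader may turn yes into the boolean True, so accept both.
    enabled = value is True or str(value).lower() == "yes"
    if enabled:
        for key in ("monitor_bak_fdev", "monitor_bak_fsys"):
            if not settings.get(key):
                problems.append(f"{key} must be set when monitor_bak_disk is yes")
    return problems


# Illustrative values only: the device setting is missing here.
print(check_backup_settings({"monitor_bak_disk": "yes", "monitor_bak_fsys": "xfs"}))
```

An empty list means the checked rule is satisfied. The check says nothing about whether the device exists or the file system type is supported.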
There are also other variables in the backup.yaml file. It is recommended that you do not alter their values unless you are an experienced Ansible user, and keep in mind that changes to these values are currently not supported.
Additional considerations
Initially, it is recommended that you monitor data consumption to make sure that your requirements are met and that the database keeps the amount of log data you need to adequately troubleshoot your environment.
Later on, if your requirements change and you need to store more data, or your day-to-day operations produce more data, you can add storage to meet your log data retention goals, then update the YAML file configuration and apply it by using the CLI commands.
The same applies to backup space. Always plan to adjust the backup storage space on the primary / bootstrap host accordingly. In a multi-node installation, it should be able to hold all data held by the database on all nodes in the cluster; in a single-node installation, it should be able to hold the same amount as the database holds.