Monitoring a Solaris host
You can monitor your Solaris host with Instana. Instana provides comprehensive insights into the Solaris host's performance, health, and resource utilization, enabling efficient troubleshooting, performance optimization, and proactive issue detection.
System information
Instana retrieves various system details from a host. You can view the following details of a host on the Instana GUI in the System pane:
Parameter | Description |
---|---|
OS | The details of the operating system. |
CPU | The details of the CPU and the count. |
Memory | The amount of system memory in GiB (gibibytes). |
Hostname | The hostname of the machine. |
FQDN | The fully qualified domain name. It is the complete domain name of the host, including the subdomain and top-level domain. |
System ID* | The custom identifier used by Instana to uniquely represent and manage the monitored host within its monitoring. System ID is used for correlation with asset management systems. |
Host ID | The MAC address of the host's network interface, which is a unique identifier for the network adapter. |
Started At | The time at which the machine started. |
*For Solaris, you need to enable System ID by using the agent configuration YAML file as shown in the following example:
"com.instana.plugin.host":
  "collectSystemId": true
Interfaces
You can find the following details:
- Interfaces: The list of network interfaces and IP addresses.
- Instana agent: The Instana agent for the host.
- Process: The count and details of the processes that are running on the host.
Performance metrics
The following performance metrics are displayed for the host.
CPU usage - percentage
The CPU usage values, when combined, provide a detailed view of how the CPU resources are being utilized on a host.
Metric | Description | Granularity |
---|---|---|
CPU Usage | The total CPU usage in percentage for the time range that you set. | 1 second |
CPU usage - total
Metric | Description | Granularity |
---|---|---|
User | The amount of CPU time spent running user-space processes (applications and services). | 1 second |
System | The amount of CPU time spent running kernel-space processes (OS core functions). | 1 second |
Wait | The amount of CPU time spent waiting for input/output operations to complete. | 1 second |
Nice | The amount of CPU time spent running processes with a lower priority (nice value). | 1 second |
Steal | The amount of CPU time lost due to the hypervisor managing other virtual machines or containers on the same physical host. | 1 second |
CPU load - average
The CPU load metric displays the value on a graph for a selected time period.
Metric | Description | Granularity |
---|---|---|
CPU Load | The average number of processes that are run for the time range that you set. | 1 second |
CPU load - peak
Metric | Description | Granularity |
---|---|---|
Load | The peak CPU load. The highest number of processes that are run for the time range that you set. | 1 second |
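To illustrate how the average and peak values relate, the following minimal Python sketch aggregates a series of load samples over a selected time range. The function name `summarize_load` and the sample values are hypothetical and are not part of Instana. The sketch assumes one sample per second, matching the granularity listed in the tables.

```python
from statistics import mean


def summarize_load(samples):
    """Return (average, peak) for a list of CPU load samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return mean(samples), max(samples)


# Hypothetical 1-second load samples for a 5-second time range.
samples = [0.5, 1.5, 2.0, 1.0, 0.0]
average, peak = summarize_load(samples)
print(f"average={average:.2f} peak={peak:.2f}")
```

The average smooths out short bursts, while the peak keeps the single highest sample, so comparing the two for the same time range can show whether the load was steady or spiky.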
Individual CPU Usage
The CPU usage metric displays the following metrics in percentage on a graph for a selected time period for each CPU:
Metric | Description | Granularity |
---|---|---|
User | The amount of CPU time spent running user-space processes (applications and services). | 1 second |
System | The amount of CPU time spent running kernel-space processes (OS core functions). | 1 second |
Wait | The amount of CPU time spent waiting for input/output operations to complete. | 1 second |
Nice | The amount of CPU time spent running processes with a lower priority (nice value). | 1 second |
Steal | The amount of CPU time lost due to the hypervisor managing other virtual machines or containers on the same physical host. | 1 second |
Memory usage
Metric | Description | Granularity |
---|---|---|
Memory Usage | The total memory usage in percentage | 1 second |
The used value is calculated as a percentage by using the formula (total - actualFree) ÷ total. The sensor uses the actualFree value, which reflects the memory that is really available because it includes both free and cached memory. It does not use the free value, which is typically low because the operating system uses otherwise idle memory for caching or buffering.
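As a worked example of the formula, the following minimal Python sketch computes the used percentage. The function name `memory_used_percent` and the input values are hypothetical and chosen only for illustration.

```python
def memory_used_percent(total, actual_free):
    """Apply (total - actualFree) / total and express the result as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - actual_free) / total * 100


# Hypothetical values in bytes: 16 GiB total, 4 GiB actually free.
GIB = 1024 ** 3
print(memory_used_percent(16 * GIB, 4 * GIB))  # 75.0
```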
Memory
The following table outlines the unit for memory:
Metric | Unit | Description | Granularity |
---|---|---|---|
Used | Percentage | Amount of memory in use | 1 second |
The values are displayed on a graph for a selected time period.
Open files
Open files usage, when available on the operating system, is shown as current versus max. The values are displayed on a graph for a selected time period. The Solaris operating system has limited support: the global zone supports only the current metric, and non-global zones do not support any metrics.
Metric | Unit | Description | Granularity |
---|---|---|---|
Current | Count | The number of files that are currently open on the host. | 1 second |
File system
These metrics provide insights into file system performance, capacity, and usage, allowing administrators to monitor and optimize their storage systems effectively.
Metric | Description | Granularity |
---|---|---|
Device | The name of the device. | 60 seconds |
Mounts | Mount location of the file system | 60 seconds |
Options | The options or parameters that are used while mounting the file system. | 60 seconds |
Free | The amount of free space available on the file system. | 1 second |
Leaked | Space that is allocated but not used, considered "leaked" or wasted. | 1 second |
Reads/s | The number of read operations per second. | 1 second |
Writes/s | The number of write operations per second. | 1 second |
Type | The type of file system. | 60 seconds |
Capacity | The total capacity of the file system. | 60 seconds |
Used | The amount of space used on the file system. | 1 second |
Inode Usage | The percentage of inodes (data structures that describe files and directories) in use. | 1 second |
Total Utilization | The overall utilization of the file system, combining read, write, and inode usage. | 60 seconds |
Read Utilization | The utilization of read operations. | 60 seconds |
Write Utilization | The utilization of write operations. | 60 seconds |
Bytes Read/s | The number of bytes read from the file system. | 1 second |
Bytes Written/s | The number of bytes written to the file system. | 1 second |
Datapoint: Filesystem
* The total, read, and write usage datapoint metrics display the disk I/O utilization as a percentage.
* Leaked refers to space held by deleted files that are still in use and equates to capacity - used - free. You can find these files with lsof | grep deleted.
** The Total Utilization, Read Utilization, and Write Utilization datapoints are not supported for Network File Systems (NFS).
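The leaked value can be checked by hand with the capacity - used - free relationship. The following minimal Python sketch shows the arithmetic. The function name `leaked_space` and the numbers are hypothetical.

```python
def leaked_space(capacity, used, free):
    """Return capacity - used - free, the space held by deleted files that are still open."""
    return capacity - used - free


# Hypothetical values in GiB.
print(leaked_space(capacity=100, used=60, free=35))  # 5
```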
By default, Instana monitors only local file systems. You can list the file systems that are monitored or excluded in the configuration.yaml file.
The name for the configuration setting is the device name, which you can obtain from the first column of the mtab file or the df command output.
You must specify temporary file systems in the following format: tmpfs:/mount/point.
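The naming rule can be sketched in Python as follows. This is a minimal illustration, not an Instana tool: the function name `filesystem_config_names` and the sample df output are hypothetical, and the sketch assumes that the device column of a temporary file system reads tmpfs and that mount points contain no spaces. The device column for temporary file systems can differ between operating systems, so check the output on your host.

```python
def filesystem_config_names(df_output):
    """Derive configuration names from df-style output.

    Uses the device name (first column) for each file system, except for
    temporary file systems, which are written as tmpfs:/mount/point.
    """
    names = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[-1]
        if device == "tmpfs":  # assumption: temporary file systems report "tmpfs"
            names.append(f"tmpfs:{mount_point}")
        else:
            names.append(device)
    return names


# Hypothetical df output; real output differs per host.
SAMPLE_DF = """
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1       41152736 9120492  30235076  24% /
tmpfs            8166152       0   8166152   0% /sys/fs/cgroup
server:/usr/local/pub 103081248 5120 103076128 1% /mnt/pub
"""
print(filesystem_config_names(SAMPLE_DF))
```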
The following example shows the list of file systems that are monitored:
com.instana.plugin.host:
  filesystems:
    - '/dev/sda1'
    - 'tmpfs:/sys/fs/cgroup'
    - 'server:/usr/local/pub'
The following example shows the file systems that are included or excluded:
com.instana.plugin.host:
  filesystems:
    include:
      - '/dev/xvdd'
      - 'tmpfs:/tmp'
      - 'server:/usr/local/pub'
    exclude:
      - '/dev/xvda2'
Network File Systems (NFS)
To monitor all NFS mounts, set the nfs_all configuration parameter to true as shown in the following example:
com.instana.plugin.host:
  nfs_all: true
Network interfaces
The following table outlines the network traffic and errors per interface.
Metric | Description | Granularity |
---|---|---|
Interface | The network interface being used for communication. | 60 seconds |
Mac | The Media Access Control (MAC) address of the network interface. | 60 seconds |
IPs | The IP addresses assigned to the network interface. | 60 seconds |
RX Bytes | The total number of bytes received by the network interface per second. | 1 second |
RX Errors | Errors encountered while receiving data on the network interface. | 1 second |
TX Bytes | The total number of bytes transmitted by the network interface per second. | 1 second |
TX Errors | Errors encountered while transmitting packets on the network interface. | 1 second |
Received/s | The number of packets received by the network interface per second. | 1 second |
Transmitted/s | The number of packets transmitted by the network interface per second. | 1 second |
Process top list
These metrics offer insights into running processes, including their process ID, name, CPU usage, normalized CPU usage, and memory consumption. The top process list is updated every 30 seconds and contains only the processes with significant system usage. For example, processes with more than 10% CPU usage over the last 30 seconds or with more than 512 MB memory usage (RSS) are displayed in the process top list.
To create a combined list of processes from the top 10 CPU and memory usage lists, set combineTopProcesses to true. The processes are included in the combined list even if their CPU usage is less than 10% or memory usage is less than 512 MB. If the same process is listed in the top 10 CPU and top 10 memory usage lists, it is listed only once in the combined list, which can include up to 20 entries.
com.instana.plugin.host:
  combineTopProcesses: true
The normalized CPU is calculated by dividing the CPU by the number of logical processors.
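The combined list and the normalized CPU value can be sketched in Python as follows. This is an illustration of the described behavior under stated assumptions, not the agent's implementation: the function names, the dictionary keys, and the sample processes are all hypothetical.

```python
def top_n(processes, key, n=10):
    """Return the n processes with the highest value for the given key."""
    return sorted(processes, key=lambda p: p[key], reverse=True)[:n]


def combine_top_processes(processes, n=10):
    """Merge the top-n CPU and top-n memory lists, listing each PID once."""
    combined = {}
    for proc in top_n(processes, "cpu", n) + top_n(processes, "memory_mb", n):
        combined.setdefault(proc["pid"], proc)  # keep the first occurrence only
    return list(combined.values())


def normalized_cpu(cpu_percent, logical_processors):
    """Divide the CPU usage by the number of logical processors."""
    return cpu_percent / logical_processors


# Hypothetical process samples.
processes = [
    {"pid": 101, "name": "app-a", "cpu": 40.0, "memory_mb": 300},
    {"pid": 102, "name": "app-b", "cpu": 1.0, "memory_mb": 2048},
    {"pid": 103, "name": "app-c", "cpu": 8.0, "memory_mb": 900},
]
combined = combine_top_processes(processes, n=2)
print([p["pid"] for p in combined])
print(normalized_cpu(40.0, logical_processors=8))
```

With n set to 10, the merged list holds at most 20 entries, and fewer whenever a process appears in both the CPU and the memory list.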
Metric | Description | Granularity |
---|---|---|
PID | The unique identifier that is assigned to each process by the operating system. | 30 seconds |
Process Name | The name of the process as defined by the application or service. | 30 seconds |
CPU | The amount of CPU resources consumed by the process. | 30 seconds |
CPU (normalized) | The CPU usage of the process divided by the number of logical processors. | 30 seconds |
Memory | The amount of memory consumed by the process. | 30 seconds |
Health signatures
For each sensor, a knowledge base of health signatures is evaluated continuously against the incoming metrics. They are used to raise issues or incidents depending on user impact.
Built-in events trigger issues or incidents based on failing health signatures on entities, and custom events trigger issues or incidents based on the thresholds of an individual metric of an entity.
For more information about the built-in events for the Host sensor, see Built-in events reference.