Linux-UNIX: S-TAP statistics

S-TAP® statistics are easily viewed in the predefined S-TAP and External S-TAP Statistics report.

You can create alerts based on results.

S-TAP statistics collection is configured with the parameter stap_statistic. This is an advanced parameter and should be modified only by Guardium Technical Support or advanced users. It specifies the interval at which the S-TAP sends statistics to the sniffer. Valid values are:
  • Positive integer: interval in hours
  • Negative integer: interval in minutes (the absolute value is used)
  • 0: do not send statistics
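The sign convention for stap_statistic can be illustrated with a minimal sketch. The function name below is hypothetical and is not part of S-TAP; it only restates the three valid value types listed above as an interval in seconds.

```python
from typing import Optional


def stap_statistic_interval_seconds(value: int) -> Optional[int]:
    """Translate a stap_statistic value into a reporting interval in seconds.

    Hypothetical helper for illustration only. Returns None when
    statistics are disabled (value 0).
    """
    if value == 0:
        return None              # 0: do not send statistics
    if value > 0:
        return value * 3600      # positive integer: interval in hours
    return -value * 60           # negative integer: interval in minutes


if __name__ == "__main__":
    for v in (1, -5, 0):
        print(v, stap_statistic_interval_seconds(v))
```

For example, a value of -5 corresponds to a five-minute interval, and a value of 1 corresponds to a one-hour interval.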
Some statistics are cumulative and some are real time. Cumulative fields must be reset to get a current count. Real-time fields are dynamic and do not require a reset. The values can be reset only from the database server, by running the following command:
<S-TAP Shell Install Directory>/guard_stap/ktap/current/guard_ktap_stat reset
or
<S-TAP GIM Install Directory>/modules/KTAP/current/guard_ktap_stat reset
Cumulative fields are indicated as such in the following lists.

The reset command resets all of the K-TAP statistics.

K-TAP statistics
timestamp
Time at which the record was created.
software_tap_host
The host system where data is collected. (tap_ip value in the S-TAP configuration.)
ktap_buffer_index
The K-TAP buffer that this cluster of parameters refers to (mostly).
total_bytes_so_far
The total number of bytes that were processed by K-TAP since the last reset of these values. Reset the counters to come to any meaningful conclusions with this data.

This counter rolls over to 0 after it reaches 4294967296 bytes (2^32). If the last reset was long ago, the counter might have rolled over several times. On its own, the total bytes processed value reveals little. Its delta over time can be used to estimate the volume of traffic that is processed by S-TAP if K-TAP is the only driver that is used to intercept traffic. For this purpose, it is not necessary to first reset the counter. Total Bytes Processed is most helpful when it is used as a baseline for some of the other statistics that are described next.
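
The delta calculation can be sketched as follows. This is an illustration, not Guardium code: the function name is hypothetical, and it assumes that the counter rolled over at most once between the two samples, which holds only if samples are taken often enough for the traffic volume.

```python
COUNTER_MODULUS = 2 ** 32  # total_bytes_so_far rolls over to 0 at 4294967296


def bytes_between_samples(previous: int, current: int) -> int:
    """Bytes processed between two total_bytes_so_far samples.

    Assumes at most one rollover between the samples. Modular
    subtraction gives the right answer whether or not the counter
    wrapped past 2^32 in between.
    """
    return (current - previous) % COUNTER_MODULUS


if __name__ == "__main__":
    # No rollover between samples.
    print(bytes_between_samples(1000, 5000))
    # Counter wrapped: previous sample was close to 2^32.
    print(bytes_between_samples(4294967000, 200))
```

Dividing the result by the number of seconds between the two samples gives an approximate bytes-per-second rate.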

total_bytes_dropped_so_far
Cumulative. The total number of bytes that were dropped by K-TAP since the last reset of these values. Reset the counters to come to any meaningful conclusions with this data.
By default, K-TAP uses a 4 MB buffer file, which is configurable from the guard_tap.ini. If the guard_stap process cannot read data quickly enough from this buffer, K-TAP starts dropping data and the drops are reflected in this field. The significance of any drops that are shown here should be put in the context of Total Bytes Processed so Far. An excessive number of bytes dropped can be indicative of issues, including:
  • Insufficient resources on host server for S-TAP (guard_stap process) to read data from the K-TAP buffer in a timely manner.
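Putting dropped bytes in the context of processed bytes can be sketched as a simple ratio. The function below is hypothetical. It assumes that both counters were read after the same reset and that total_bytes_so_far has not rolled over since then; whether the processed total already includes dropped bytes is not stated here, so treat the result as a rough indicator only.

```python
def dropped_percent(total_bytes_dropped_so_far: int, total_bytes_so_far: int) -> float:
    """Dropped bytes as a percentage of bytes processed since the last reset.

    Hypothetical helper: assumes both values come from the same
    statistics record and that neither counter has rolled over.
    """
    if total_bytes_so_far == 0:
        return 0.0  # nothing processed yet, so no meaningful ratio
    return 100.0 * total_bytes_dropped_so_far / total_bytes_so_far


if __name__ == "__main__":
    print(dropped_percent(50, 1000))
```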
total_bytes_ignored
Cumulative. The amount of database traffic that is ignored at the K-TAP level when IGNORE STAP SESSION rules are implemented in the Guardium policy, since the last reset of these values. It is useful for estimating how effectively the policy is ignoring traffic. Total Bytes Ignored should be considered only after a reset and in the context of Total Bytes Processed so Far.
total_buffer_init
Cumulative. The number of times the K-TAP buffer was reinitialized. S-TAP can reinitialize the K-TAP buffer if any corruption of buffered data is detected.
ioctl_requests
Cumulative. Number of K-TAP requests issued so far (not per buffer).
total_response_bytes_ignored
Cumulative. The number of bytes of database response traffic that were ignored at the K-TAP level as the result of any IGNORE RESPONSES PER SESSION rules that are implemented in the Guardium policy, since the last reset of these values.
total_packet_count
Number of packets seen since the last reset of these values.
time_since_last_reset_in_seconds
Time, in seconds, that has elapsed since the last statistics reset.
CPU statistics
system_cpu_percent
Shows the overall CPU usage of S-TAP for the entire system. It is calculated by using the pcpu option from the ps command. These situations might indicate a problem:
  • Usage is consistently at, or near, 100%. Such a condition might indicate that the guard_stap process is stuck in a loop and is using all of the resources on one core. Run the guard_diag command when you encounter such cases.
  • Overall usage is abnormally high. This number depends on the total number of cores that are running on the system. For example, consider a consistent S-TAP CPU usage of 5% on a system with 16 cores. In this case, 5% indicates that S-TAP is consistently using 80% of one core. If S-TAP is consistently running that high, it leaves little overhead to accommodate any other spikes in traffic. Worse, S-TAPs that are running close to 100% of one core might introduce performance degradation on the host server because it can make S-TAP unresponsive to K-TAP requests.
system_cpu_idle_percent
Percentage of total system CPU that is idle.
stap_cpu_percent
Average system CPU usage of S-TAP over the life of the process. This statistic shows how busy the host server is overall. For example, if the system has 10 cores, and S-TAP is using 30% of one core, the overall S-TAP CPU usage is about 3%. The maximum CPU that an S-TAP can use depends on the threading. See Linux-UNIX: Multi-threading S-TAP to increase S-TAP throughput.
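The two worked examples in this section (5% overall on 16 cores, and 30% of one core on 10 cores) use the same arithmetic in opposite directions. A minimal sketch, with hypothetical function names, assuming that the overall figure is a share of all cores combined:

```python
def per_core_percent(overall_percent: float, cores: int) -> float:
    """Convert a system-wide CPU percentage to the equivalent share of one core."""
    return overall_percent * cores


def overall_percent(single_core_percent: float, cores: int) -> float:
    """Convert a share of one core to a system-wide CPU percentage."""
    return single_core_percent / cores


if __name__ == "__main__":
    # 5% overall on a 16-core system is 80% of one core.
    print(per_core_percent(5.0, 16))
    # 30% of one core on a 10-core system is about 3% overall.
    print(overall_percent(30.0, 10))
```

A per-core result that approaches 100 for a single-threaded S-TAP suggests that little headroom remains for traffic spikes.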
Buffer statistics
stap_buffer_usage_percent
Average percentage use of all S-TAP buffers.
buffer_recycled
Cumulative. Recycle count for this index S-TAP buffer.
Indicates the number of times the S-TAP buffer overflowed, which is highly indicative of S-TAP performance issues on the host server. The S-TAP buffer can overflow for several reasons, including:
  • Insufficient network bandwidth to accommodate the volume of data that is sent by the S-TAP to the Guardium appliance. This issue is most prevalent when the Guardium appliance and host database server are not in the same data center or LAN.
  • Guardium collector is too busy to handle the volume that is sent from S-TAP (this is a rare case).
A-TAP statistics
activated_ataps
Comma-separated list of instance names that are active.
non_activated_ataps
Comma-separated list of instance names that are inactive.
erroneous_ataps
Comma-separated list of instance names whose states appear to be improper.
dropped_priority_packets
Count of priority packets that were dropped.
Shared memory statistics
exit_number_of_shmem_segments
Total number of shmem segments.
exit_total_packets_so_far
Total packet count seen by this shm region.
exit_total_bytes_so_far
Total packet byte count seen by this shm region.
exit_total_0_16_bytes_packets
Total packet count seen where packet size > 0 && <= 16.
exit_total_16_4k_bytes_packets
Total packet count seen where packet size > 16 && <= 4000.
exit_total_4k_16k_bytes_packets
Total packet count seen where packet size > 4000 && <= 16000.
exit_total_16k_32k_bytes_packets
Total packet count seen where packet size > 16000 && <= 32000.
exit_total_32k_plus_bytes_packets
Total packet count seen where packet size > 32000.
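The size ranges of the five packet counters above can be restated as a lookup. This sketch is for illustration only: the function is hypothetical, and the boundaries are copied from the field descriptions (upper bounds inclusive).

```python
def packet_size_bucket(size: int) -> str:
    """Return the name of the counter that a packet of this size falls into."""
    if size <= 0:
        raise ValueError("packet size must be greater than 0")
    if size <= 16:
        return "exit_total_0_16_bytes_packets"
    if size <= 4000:
        return "exit_total_16_4k_bytes_packets"
    if size <= 16000:
        return "exit_total_4k_16k_bytes_packets"
    if size <= 32000:
        return "exit_total_16k_32k_bytes_packets"
    return "exit_total_32k_plus_bytes_packets"


if __name__ == "__main__":
    for size in (16, 17, 4000, 4001, 32001):
        print(size, packet_size_bucket(size))
```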
exit_total_bytes_dropped_so_far
Total number of bytes dropped.
exit_total_packet_drops_so_far
Total number of packets dropped.
Proxy statistics
proxy_active_ssl_connections
Number of active SSL sessions through proxy.
proxy_memory_usage
Memory (in KB) that proxy is using.
proxy_cpu_usage
Amount of CPU (percentage) that proxy is using.
proxy_latency_0ms_1ms
Count of packets with latency 0-1ms.
proxy_latency_1ms_10ms
Count of packets with latency 1ms-10ms.
proxy_latency_10ms_100ms
Count of packets with latency 10ms-100ms.
proxy_latency_100ms_1s
Count of packets with latency 100ms-1s.
proxy_latency_1s_plus
Count of packets with latency greater than 1s.
proxy_last_minute_tcp_payload
Amount of TCP data (in bytes) handled by proxy in last minute.
proxy_last_minute_tls_payload
Amount of TLS data (in bytes) handled by proxy in last minute.
proxy_last_minute_total_connections
Total number of connections accepted by proxy in last minute.
proxy_last_minute_ssl_accepted_connections
Total number of SSL connections accepted by proxy in last minute.
S-TAP buffer statistics
stap_total_packets_dropped
Total number of packets dropped due to insufficient buffer space in the S-TAP service.
stap_total_bytes_dropped
Total number of bytes dropped due to insufficient buffer space in the S-TAP service.
stap_collector_count
Number of collectors.
stap_collector_names
Comma-delimited list of all the collectors that are assigned to the S-TAP.
stap_collector_packets_dropped
Comma-delimited list of collector packet drop count, in the same order as stap_collector_names.
stap_collector_bytes_dropped
Comma-delimited list of collector byte drop count, in the same order as stap_collector_names.
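Because the three collector lists share the same ordering, they can be paired by position. The sketch below shows one way to do that. The function name and the sample values are hypothetical; how the field values are obtained from the report is outside the scope of this example.

```python
def collector_drops(names: str, packets_dropped: str, bytes_dropped: str) -> dict:
    """Pair each collector name with its packet and byte drop counts.

    Each argument is the comma-delimited value of the corresponding
    statistic: stap_collector_names, stap_collector_packets_dropped,
    and stap_collector_bytes_dropped.
    """
    name_list = [n.strip() for n in names.split(",")]
    packet_list = [int(p) for p in packets_dropped.split(",")]
    byte_list = [int(b) for b in bytes_dropped.split(",")]
    if not (len(name_list) == len(packet_list) == len(byte_list)):
        raise ValueError("the three lists must have the same number of entries")
    return {
        name: {"packets_dropped": p, "bytes_dropped": b}
        for name, p, b in zip(name_list, packet_list, byte_list)
    }


if __name__ == "__main__":
    # Made-up sample values for illustration.
    result = collector_drops(
        "collector-a.example.com,collector-b.example.com", "0,12", "0,4096"
    )
    for name, drops in result.items():
        print(name, drops)
```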