Accounting - SMF records

Installations use Systems Management Facilities (SMF) records for the following reasons:

Performance management
Performance management includes the tasks that are related to verifying that defined service levels are met, and if not, identifying possible causes.
Aggregated information about delivered service, structured by the organizational units for which service levels have been defined, is needed to perform these tasks. These reports are typically time series at varying levels of granularity, ranging from weeks through days down to an interval that matches the SMF interval. Some examples of potential reports related to performance management are:
  • TCP connection elapsed time per server port number per time of day (potentially broken down on source IP address, or netmask)
  • Number of TCP connections per server port number per time of day (potentially broken down on source IP address, or netmask)
  • Number of inbound/outbound bytes transferred in TCP connections per time of day (potentially broken down in various ways: per destination or source port, per destination IP address, netmask, or in total, etc.)
  • TCP retransmission activity per time of day (potentially broken down per destination IP address, or netmask)
  • Number of outbound TCP connections per time of day (potentially broken down per destination IP address, or netmask)
  • Number of inbound/outbound UDP datagrams per time of day (potentially broken down on server port number)
  • Number of discards, error packets, and unknown protocol packets inbound and outbound per time of day (potentially broken down per interface)
Capacity planning
Capacity planning includes tasks that are related to forecasting capacity in terms of central processing power, memory, channel-based I/O subsystem, network attachments, and network bandwidth. Such planning tasks are based on analyzing trends in the use of capacity during a preceding period (typically one to two years), and applying forecasting metrics to this trend, along with knowledge about planned launches of new applications or changed use of existing applications, in order to predict capacity needs during the next one- to two-year period. Some examples of potential reports related to capacity planning are:
  • Total number of TCP connections per reserved server port number per day including analysis of average and variations around average during daily peak periods
  • Total number of inbound/outbound UDP datagrams per reserved server port number per day including average and variations around average during daily peak periods
  • Number of bytes and/or packets transferred inbound and outbound per interface (LINK) per time of day (potentially broken down into unicasts, broadcasts, and multicasts)
  • Size of queue length per interface per time of day
Auditing
Auditing involves tasks that are related to identifying and proving that individual events have taken place. Some examples of potential reports related to auditing are:
  • Detailed information about specific TCP connections or UDP sockets, IP addresses, server/client identification, duration, number of bytes, and so on
  • Details about activity that involves a specific client or server
  • Details about a given application session based on server-specific SMF recording, such as individual Telnet sessions or FTP sessions
  • Details about changes to the TCP/IP stack profile and the user that requested the change
  • Details about changes to the status of dynamic virtual IP addresses (DVIPAs) and sysplex distributor targets
Accounting
Accounting involves tasks that are related to calculating how much each individual user or organizational unit should be charged for use of the shared central IS resources. Input to such calculations varies, but is often based on CPU cycle use, data quantities, bandwidth usage, and memory use. For TCP/IP, additional metrics may be defined, such as type of service used (FTP, LPD, web server, and so on), and TCP connection-related information (number of connections, duration, byte transfer counts, and so on). Some examples of potential reports related to accounting are:
  • Aggregated number of connections to a given server from a given source in terms of a specific client IP address, or netmask
  • Accumulated connect time to a given server from a given source in terms of a specific client IP address, or netmask
  • Number of bytes transferred to or from a given source in terms of a specific client IP address, or netmask
  • Amount of data protected by specific manual or dynamic tunnels
  • Application-level accounting information specific to each individual server, for example:
    • For Communications Server SMTP (CSSMTP): Information about mail message processing
    • For FTP: Number of transfer operations and bytes retrieved or stored per user ID
    • For IKED: Information about IKE tunnels
    • For TN3270: Number of sessions and session type (TN3270/TN3270E/LINEMODE)
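
The accounting reports that group activity by client IP address or netmask can be illustrated with a minimal sketch. It assumes that connection-level SMF data has already been decoded into simple records by some other tool; the ConnRecord fields and the charge_by_subnet function are hypothetical names chosen for illustration and do not reflect any actual SMF record layout. The default /24 prefix length is likewise an assumption and is only meaningful for IPv4 addresses.

```python
from collections import defaultdict
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class ConnRecord:
    """Hypothetical, already-decoded TCP connection record (not a real SMF layout)."""
    client_ip: str
    server_port: int
    duration_s: float
    bytes_in: int
    bytes_out: int


def charge_by_subnet(records, prefix_len=24):
    """Sum connections, connect time, and bytes per client subnet."""
    totals = defaultdict(
        lambda: {"connections": 0, "connect_time_s": 0.0, "bytes": 0}
    )
    for rec in records:
        # strict=False masks off the host bits, giving the enclosing network.
        subnet = ip_network(f"{rec.client_ip}/{prefix_len}", strict=False)
        entry = totals[str(subnet)]
        entry["connections"] += 1
        entry["connect_time_s"] += rec.duration_s
        entry["bytes"] += rec.bytes_in + rec.bytes_out
    return dict(totals)


if __name__ == "__main__":
    sample = [
        ConnRecord("10.1.2.3", 21, 12.5, 100, 200),
        ConnRecord("10.1.2.77", 21, 7.5, 50, 50),
        ConnRecord("192.168.5.9", 23, 1.0, 10, 0),
    ]
    for subnet, usage in charge_by_subnet(sample).items():
        print(subnet, usage)
```

Grouping on a netmask rather than on individual addresses keeps the report aligned with organizational units when each unit owns a subnet; the same loop could key on (subnet, server_port) if charges differ by type of service.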

In general, SMF records are created for deferred processing and analysis and are not used for real-time monitoring purposes. In a TCP/IP environment, real-time monitoring is implemented with the SNMP protocol and is based on internal variables that SNMP subagents maintain. However, on z/OS®, much of the information that is written in SMF records is also useful from a real-time monitoring perspective. For information about using z/OS Communications Server TCP/IP network management interfaces (NMIs) to obtain SMF records in real time, see z/OS Communications Server: IP Programmer's Guide and Reference.

As can be seen, all disciplines require detailed data as input. Depending on the discipline, a certain level of aggregation is performed on the raw detailed data in order to perform the tasks of that discipline. The objective of the TCP/IP product is to define and generate the lowest level of detail that is needed by all disciplines. Deciding how to aggregate, and performing the actual aggregation, is left to other products, such as Performance Reporter for z/OS (PR), MVS™ Information Control System (MICS), SAS-based tools, or, in many cases, customer-written programs.
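
As an illustration of what such a customer-written aggregation step might look like, the following minimal sketch rolls detailed connection data up into a time series per server port per hour. It assumes the detail records have already been extracted into plain tuples; the tuple layout and the function name connections_per_port_per_hour are hypothetical and are not part of any product interface.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def connections_per_port_per_hour(records):
    """Aggregate detail records into (server_port, hour) buckets.

    records: iterable of (end_time: datetime, server_port: int, elapsed_s: float)
    tuples. This layout is an assumption made for the sketch.
    """
    buckets = defaultdict(list)
    for end_time, port, elapsed_s in records:
        # Truncate the timestamp to the start of its hour to form the bucket key.
        hour = end_time.replace(minute=0, second=0, microsecond=0)
        buckets[(port, hour)].append(elapsed_s)
    return {
        key: {"connections": len(values), "mean_elapsed_s": mean(values)}
        for key, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    sample = [
        (datetime(2024, 1, 1, 9, 5), 21, 2.0),
        (datetime(2024, 1, 1, 9, 50), 21, 4.0),
        (datetime(2024, 1, 1, 10, 1), 23, 30.0),
    ]
    for (port, hour), stats in connections_per_port_per_hour(sample).items():
        print(port, hour.isoformat(), stats)
```

Because the input stays at the lowest level of detail, the same records can be bucketed by day or week for capacity planning simply by changing how the bucket key is derived.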

TCP/IP-produced SMF records should not be viewed in isolation. Other components in MVS produce SMF records for the same purposes as those produced by TCP/IP. An installation is likely to combine information from a series of subsystems when performing detailed performance management or capacity planning. For example, SMF records with information about use of CPU resources and memory resources per address space are produced by other components in MVS, and TCP/IP-produced SMF records should not duplicate that information.

The events that trigger SMF records to be written and the information included in the SMF records must accommodate the intended purposes. There can be multiple purposes for given SMF records.

SMF records can be cut at multiple levels in the TCP/IP protocol stack, and the type of information that can be included depends on where the SMF record is created:
  • At the IP and interface layer, there is information about ICMP activity, IP packet fragmentation and reassembly activity, IP checksum errors, IGMP activity, and ARP activity. At this layer, it is difficult to relate the information to specific users (remote clients, local socket address spaces, and so on), so from an accounting point of view, this information is not very interesting. Because you can aggregate network-layer activity to physical interfaces, the information at the IP and interface layer is an important aspect of both performance and capacity management.
  • At the transport protocol layer, there is information about IP addresses, port numbers, and host names. For TCP-related workload, there is information about connections and information that is related to TCP connections, such as byte counts, connection times, reliability metrics, and performance metrics. For UDP-related workload, each UDP datagram is a separate entity; the only way to aggregate information for UDP is on a UDP socket level, where SMF records could be created every time a UDP socket is closed.
  • At the application layer, there are more details about what goes on, but every application is different and requires its own SMF record definitions and the ability to write those SMF records in order to implement application-layer SMF recording. Application-layer SMF recording is implemented by some applications, such as the TN3270E Telnet server (Telnet), the FTP server, the IKE daemon, and the CSSMTP daemon.
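
The UDP case described for the transport protocol layer, where counters are accumulated per socket and a record is produced only when the socket is closed, can be sketched as follows. This is a conceptual model only: the class names and the on_open, on_datagram, and on_close callbacks are hypothetical, and nothing here represents how the TCP/IP stack itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class UdpSocketTotals:
    """Hypothetical per-socket summary, emitted once at socket close."""
    socket_id: int
    local_port: int
    datagrams_in: int = 0
    datagrams_out: int = 0
    bytes_in: int = 0
    bytes_out: int = 0


class UdpSocketAccounting:
    """Accumulates per-socket counters; a record exists only after close."""

    def __init__(self):
        self._open = {}           # socket_id -> running totals
        self.closed_records = []  # stands in for records that would be written

    def on_open(self, socket_id, local_port):
        self._open[socket_id] = UdpSocketTotals(socket_id, local_port)

    def on_datagram(self, socket_id, nbytes, inbound):
        # Each datagram is a separate entity, so it only updates counters.
        totals = self._open[socket_id]
        if inbound:
            totals.datagrams_in += 1
            totals.bytes_in += nbytes
        else:
            totals.datagrams_out += 1
            totals.bytes_out += nbytes

    def on_close(self, socket_id):
        # Closing the socket is the event that triggers the record.
        totals = self._open.pop(socket_id)
        self.closed_records.append(totals)
        return totals


if __name__ == "__main__":
    acct = UdpSocketAccounting()
    acct.on_open(1, 53)
    acct.on_datagram(1, 60, inbound=True)
    acct.on_datagram(1, 120, inbound=False)
    print(acct.on_close(1))
```

Tying the record to the socket close event gives UDP a unit of aggregation comparable to a TCP connection, at the cost of producing nothing for sockets that stay open for long periods.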

For more information about the SMF records provided by z/OS Communications Server functions, see z/OS Communications Server: IP Programmer's Guide and Reference.