Running ProbeVue

Dynamic tracing is only allowed for users with privileges or for the superuser.

Authorizations and privileges

Requiring privileges for dynamic tracing differs from the static tracing facilities in AIX®, which enforce relatively limited privilege checking. There is a reason for requiring privileges to run the probevue command: a Vue script can affect system performance more severely than a static tracing facility such as AIX system trace, because probe points for system trace are predefined and restricted. ProbeVue can support many more probe points, and probe locations can be defined almost anywhere. Further, ProbeVue trace actions at a probe point can take much longer to run than system trace actions at a probe point, because the latter are limited to explicit data capture.

In addition, ProbeVue allows you to trace processes and read kernel global variables, both of which need to be controlled to prevent security exposures. A ProbeVue session can also consume a lot of pinned memory, so restricting ProbeVue to users with privileges reduces the risk of denial-of-service attacks. ProbeVue also allows administrators to control the memory usage of ProbeVue sessions through the SMIT interface.

Privileges for dynamic tracing are obtained differently depending upon whether role-based access control (RBAC) is enabled or not. Please refer to the AIX man pages for more information about enabling and disabling RBAC.

Note that in legacy or RBAC-disabled mode, there are no authorizations. Regular users cannot acquire privileges to run the probevue command to start a dynamic tracing session or run the probevctrl command to administer ProbeVue. Only the superuser can have privileges for both these functions. Do not disable RBAC when using ProbeVue unless you prefer to restrict this facility to root users only.

RBAC-enabled mode

Privileges in an RBAC system are obtained through authorizations. An authorization is a text string associated with security-related functions or commands. Authorizations provide the mechanism for granting users the right to perform privileged actions. Only a user with sufficient authorization can issue the probevue command and start a dynamic tracing session.

aix.ras.probevue.trace.user.self
This authorization allows you to trace your applications in user space. The user ID of the process to be traced must be equal to the real user ID of the user invoking the probevue command. This authorization allows you to enable probe points provided by the uft probe manager for your processes. However, the effective, real and saved user IDs of the process to be traced must be equal. Thus, you cannot trace setuid programs with just this authorization.
aix.ras.probevue.trace.user
This authorization allows you to trace any application in user space including setuid programs and applications started by the superuser. Be careful when handing out this authorization. This authorization allows you to issue the probevue command and enable probe points provided by the uft probe manager for any process on the system.
aix.ras.probevue.trace.syscall.self
This authorization allows you to trace system calls made by your applications. The effective, real and saved user IDs of the process making the system call must be the same and equal to the real user ID of the user invoking the probevue command. This authorization allows you to enable probe points provided by the syscall probe manager for your processes. The second field of the probe specification must indicate the process ID for a process started by you.
aix.ras.probevue.trace.syscall
This authorization allows you to trace system calls made by any application on the system including setuid programs and applications started by the superuser. Be careful when handing out this authorization. This authorization allows you to issue the probevue command and enable probe points provided by the syscall probe manager for any process. The second field of the probe specification can either be set to a process ID to probe a specific process or to * to probe all processes.
aix.ras.probevue.trace
This authorization allows you to trace the entire system and includes all the authorizations defined in the preceding sections. You can also access and read kernel variables when running the probevue command, and trace system trace events using the systrace probe manager. Be careful when handing out this authorization.
aix.ras.probevue.manage
This authorization allows you to administer ProbeVue. This includes changing the values of the different ProbeVue parameters, starting or stopping ProbeVue and viewing details of dynamic tracing sessions of all users when running the probevctrl command. Without this authorization, you can use the probevctrl command to view session data for dynamic tracing sessions started by you or view the current values for ProbeVue parameters.
aix.ras.probevue.rase
This authorization allows you access to a highly privileged set of "RAS events" Vue functions, which can produce system and LMT trace records, create live dumps, and even cause a system abend. This privilege must be very carefully controlled.
aix.ras.probevue
This authorization grants all dynamic tracing privileges and is equivalent to all the preceding authorizations combined.
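
The user-space rules described above can be summarized as a small decision function. The following Python sketch is only an illustrative model of the documented rules, not how AIX evaluates authorizations; the names `ProcessIds` and `may_trace_user_space` are hypothetical and exist only for this example.

```python
from typing import NamedTuple, Set


class ProcessIds(NamedTuple):
    """Hypothetical holder for the user IDs of the process to be traced."""
    real: int
    effective: int
    saved: int


# Authorizations that, per the descriptions above, allow tracing any
# process in user space.
ANY_PROCESS = {
    "aix.ras.probevue.trace.user",
    "aix.ras.probevue.trace",
    "aix.ras.probevue",
}
SELF_ONLY = "aix.ras.probevue.trace.user.self"


def may_trace_user_space(auths: Set[str], invoker_real_uid: int,
                         target: ProcessIds) -> bool:
    """Model of the documented rules for enabling uft probes on a process."""
    if auths & ANY_PROCESS:
        return True
    if SELF_ONLY in auths:
        # The real, effective and saved IDs of the target must all be equal
        # and must match the real user ID of the user invoking probevue.
        return (target.real == target.effective == target.saved
                == invoker_real_uid)
    return False
```

Under this model, a setuid-root program started by a user with real user ID 205 has IDs such as `ProcessIds(205, 0, 0)`, so the `.self` authorization alone is not sufficient to trace it.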

The superuser (or root) has all these authorizations assigned by default. Other users will need to have authorizations assigned to them by first creating a role with a set of authorizations and assigning the role to the user. The user will also need to switch roles to a role that has the required authorizations defined for dynamic tracing before invoking the probevue command. The following script is an example of how to provide user "joe" authorization to enable user space and system call probes for processes started by "joe".

	mkrole authorizations="aix.ras.probevue.trace.user.self,aix.ras.probevue.trace.syscall.self" apptrace
	chuser roles=apptrace joe
	setkst -t role		# Copy roles to kernel (or wait until the system reboots)

User "joe" can be set up to always have all roles acquired by default when logging in or can switch to the role as needed using the following command:

swrole apptrace
Note: The interval probe manager does not have a specific authorization associated with it. You can enable interval probe points if you have any of the aix.ras.probevue.trace* authorizations.

ProbeVue privileges

The privileges that are available for ProbeVue are listed in the following table. A description of each privilege and the authorizations that map to that privilege is provided. Privileges form a hierarchy where the parent privilege contains all of the rights that are associated with the privileges of its children, but it can include additional privileges also.

Table 1. ProbeVue privileges
Privilege | Description | Authorizations | Associated command
PV_PROBEVUE_TRC_USER_SELF | Allows a process to enable dynamic user space probe points on another process with the same real user ID. | aix.ras.probevue.trace.user.self, aix.ras.probevue.trace.user, aix.ras.probevue.trace, aix.ras.probevue | probevue
PV_PROBEVUE_TRC_USER | Allows a process to enable dynamic user space probe points. Includes the PV_PROBEVUE_TRC_USER_SELF privilege. | aix.ras.probevue.trace.user, aix.ras.probevue.trace, aix.ras.probevue | probevue
PV_PROBEVUE_TRC_SYSCALL_SELF | Allows a process to enable dynamic system call probe points on another process with the same real user ID. | aix.ras.probevue.trace.syscall.self, aix.ras.probevue.trace.syscall, aix.ras.probevue.trace, aix.ras.probevue | probevue
PV_PROBEVUE_TRC_SYSCALL | Allows a process to enable dynamic system call space probe points. Includes the PV_PROBEVUE_TRC_SYSCALL_SELF privilege. | aix.ras.probevue.trace.syscall, aix.ras.probevue.trace, aix.ras.probevue | probevue
PV_PROBEVUE_TRC_KERNEL | Allows a process to access kernel data when dynamic tracing. | aix.ras.probevue.trace, aix.ras.probevue | probevue
PV_PROBEVUE_MANAGE | Allows a process to administer ProbeVue. | aix.ras.probevue.manage, aix.ras.probevue | probevctrl
PV_PROBEVUE_RASE | Authorizes the use of the restricted "RAS events" functions. | aix.ras.probevue.rase, aix.ras.probevue | probevue
PV_PROBEVUE_ | Equivalent to all the preceding privileges (PV_PROBEVUE_*) combined. | aix.ras.probevue | probevue, probevctrl
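
As a rough illustration of how Table 1 can be read, the following Python sketch transcribes the privilege-to-authorization mapping and looks up which privileges a set of authorizations would map to. It is a lookup over the table only, not an interface provided by AIX; the names `PRIVILEGE_AUTHS` and `privileges_granted` are hypothetical.

```python
# Privilege -> authorizations that map to it, transcribed from Table 1.
PRIVILEGE_AUTHS = {
    "PV_PROBEVUE_TRC_USER_SELF": {
        "aix.ras.probevue.trace.user.self", "aix.ras.probevue.trace.user",
        "aix.ras.probevue.trace", "aix.ras.probevue"},
    "PV_PROBEVUE_TRC_USER": {
        "aix.ras.probevue.trace.user", "aix.ras.probevue.trace",
        "aix.ras.probevue"},
    "PV_PROBEVUE_TRC_SYSCALL_SELF": {
        "aix.ras.probevue.trace.syscall.self",
        "aix.ras.probevue.trace.syscall", "aix.ras.probevue.trace",
        "aix.ras.probevue"},
    "PV_PROBEVUE_TRC_SYSCALL": {
        "aix.ras.probevue.trace.syscall", "aix.ras.probevue.trace",
        "aix.ras.probevue"},
    "PV_PROBEVUE_TRC_KERNEL": {
        "aix.ras.probevue.trace", "aix.ras.probevue"},
    "PV_PROBEVUE_MANAGE": {
        "aix.ras.probevue.manage", "aix.ras.probevue"},
    "PV_PROBEVUE_RASE": {
        "aix.ras.probevue.rase", "aix.ras.probevue"},
    "PV_PROBEVUE_": {"aix.ras.probevue"},
}


def privileges_granted(auths):
    """Return the sorted privileges that any of the given authorizations map to."""
    held = set(auths)
    return sorted(priv for priv, mapped in PRIVILEGE_AUTHS.items()
                  if mapped & held)
```

For example, `privileges_granted(["aix.ras.probevue.trace"])` returns the five `PV_PROBEVUE_TRC_*` privileges, which matches the statement that this authorization includes the narrower tracing authorizations.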

ProbeVue parameters

AIX provides a set of parameters that you can use to tune ProbeVue or the ProbeVue framework. The parameters allow you to specify both global limits on resource usage by the ProbeVue framework and to specify resource usage for individual users.
Note: Probe managers are not contained within the ProbeVue framework and hence these limits do not apply to them.

All ProbeVue parameters can be modified through the SMIT interface (use the "smit probevue" fast path) or directly through the probevctrl command. ProbeVue can be stopped if there are no active dynamic tracing sessions and it can be restarted after stopping it without requiring a reboot. ProbeVue can fail to stop if any sessions that used thread-local variables had been previously active.

The following table summarizes the parameters defined for dynamic tracing sessions. In the description, a privileged user refers to the superuser or a user with the aix.ras.probevue.trace authorization and a non-privileged user is one who does not have this authorization.

Table 2. Parameters for dynamic tracing session
Description as in SMIT | Maximum value | Initial high configuration value | Initial low configuration value | Minimum value | Description
MAX pinned memory for ProbeVue framework | 64 GB | 10% of available memory or the maximum value, whichever is smaller. | 16 MB | 3 MB | Maximum pinned memory in MB that is allocated for ProbeVue data structures, including per-CPU stacks and per-CPU local table regions, and by all dynamic tracing sessions. It does not include any memory allocated by probe managers. Note: Although this parameter can be modified at any time, the value takes effect only the next time ProbeVue is started.
Default per-CPU trace buffer size | 256 MB | 128 KB | 8 KB | 4 KB | Default size in KB of the per-CPU trace buffer. ProbeVue allocates two trace buffers per CPU for each dynamic tracing session: one active, used by the writer (the Vue program) when it captures trace data, and one inactive, used by the reader (the trace consumer). For example, on an 8-way system with the per-CPU trace buffer size set to 16 KB, the total memory consumed by the trace buffers for a ProbeVue session is 256 KB. You can specify a different buffer size (larger or smaller) when you start the probevue command, as long as it is within the session memory limits.
MAX pinned memory for regular user sessions | 64 GB | 2 MB | 2 MB | 0 MB | Maximum pinned memory allocated for a non-privileged user ProbeVue session, including memory for the per-CPU trace buffers. A value of 0 effectively disables all non-privileged users. Privileged users have no limits on the memory used by their ProbeVue sessions. However, they are still limited by the maximum pinned memory allowed for the ProbeVue framework.
MIN trace buffer read rate for regular user | 5000 ms | 100 ms | 100 ms | 10 ms | The minimum period, in milliseconds, that a non-privileged user can request the trace consumer to check for trace data. This value is internally rounded to the next highest multiple of 10 milliseconds. Privileged users are not limited by this parameter, but the fastest read rate that they can specify is 10 milliseconds.
Default trace buffer read rate | 5000 ms | 100 ms | 100 ms | 10 ms | The default period, in milliseconds, at which the trace consumer checks the in-memory trace buffers for trace data. You can specify a different read rate (larger or smaller) when starting the probevue command, as long as it is not below the minimum buffer read rate.
MAX concurrent sessions for regular user | 8 | 1 | 1 | 0 | Number of concurrent ProbeVue sessions allowed for a non-privileged user. A value of zero effectively disables all non-privileged users.
Size of per-CPU computation stack | 256 KB | 20 KB | 12 KB | 8 KB | The size of the per-CPU computation stack used by ProbeVue when running the Vue script. The value is rounded to the next highest multiple of 8 KB. ProbeVue allocates a single stack per CPU for all ProbeVue sessions. The memory consumed for the stacks is not included in the per-session limits. Note: Although this parameter can be modified at any time, the value takes effect only after the AIX kernel boot image is rebuilt and the system is rebooted. You have to configure the ProbeVue stack to use 96 KB of virtual memory to get the current directory listing.
Size of per-CPU local table size | 256 KB | 32 KB | 4 KB | 4 KB | The size of the per-CPU local table used by ProbeVue for saving variables of automatic class and for saving temporary variables. ProbeVue uses half of this area for automatic variables and the remaining half for temporary variables. The value is always rounded to the next highest multiple of 4 KB. ProbeVue allocates a single local table and a single temporary table per CPU, used by all ProbeVue sessions. The memory consumed for the local tables is not included in the per-session limits. Note: Although this parameter can be modified at any time, the value takes effect only the next time ProbeVue is started.
MIN interval allowed in an interval probe | N/A | 1 | | 1 | Minimum timer interval, in milliseconds, allowed for the global root user in interval probes.
Number of threads to be traced | N/A | 32 | 32 | 1 | Maximum number of threads that a ProbeVue session can support when it has thread-local variables. At the start of the session, the ProbeVue framework allocates thread-local variables for the maximum number of threads specified with this attribute. If more than the specified number of threads hit a probe that has a thread-local variable, the ProbeVue session is abruptly stopped.
Number of page faults to be handled | 1024 | 0 | 0 | 0 | Number of page fault contexts for handling page faults for the entire framework. A page fault context includes a stack and a local table for saving automatic class variables and temporary variables. A page fault context is required to access paged-out data. If no page fault context is free at the time of a page fault, ProbeVue does not fetch the paged-out data.
Maximum probe execution time for systrace probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, that a systrace probe executing in interrupt context can take. By default, the value is zero, which means the systrace probe can take any amount of time.
Maximum probe execution time for io probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, that an io probe executing in interrupt context can take. By default, the value is zero, which means the probe can take any amount of time.
Maximum probe execution time for sysproc probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, that a sysproc probe executing in interrupt context can take. By default, the value is zero, which means the probe can take any amount of time.
Maximum probe execution time for network probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, that a network probe executing in interrupt context can take. By default, the value is zero, which means the probe can take any amount of time.
Max network buffer size | 64 KB | 64 bytes | 96 bytes | 96 bytes | The size, in bytes, of a pre-allocated buffer used by the network probe manager for bpf probe points. The buffer is allocated when the first bpf probe is enabled and exists in the system until the last bpf probe is disabled, at which point it is released. This buffer is used to copy the data when packet data spans multiple packet buffers.
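
The arithmetic behind several of these parameters can be checked with a few lines of code. The following Python sketch is a worked example only: the function names are hypothetical, and it assumes that "rounded to the next highest multiple" leaves a value unchanged when it is already an exact multiple.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to a multiple (assumed unchanged if already exact)."""
    return -(-value // multiple) * multiple


def trace_buffer_total_kb(ncpus: int, per_cpu_buffer_kb: int) -> int:
    """Two trace buffers (one active, one inactive) are allocated per CPU."""
    return ncpus * 2 * per_cpu_buffer_kb


def buffers_fit_session_limit(ncpus: int, per_cpu_buffer_kb: int,
                              session_limit_mb: int = 2) -> bool:
    """Compare only the trace-buffer share with a per-session limit in MB.

    A real session also pins memory for other data, so this check is a
    lower bound and not the full accounting.
    """
    return trace_buffer_total_kb(ncpus, per_cpu_buffer_kb) <= session_limit_mb * 1024


print(trace_buffer_total_kb(8, 16))        # 256 (KB), the 8-way example
print(round_up(20, 8))                     # 24 (KB computation stack)
print(round_up(105, 10))                   # 110 (ms read rate)
print(buffers_fit_session_limit(16, 128))  # False: 4096 KB exceeds 2 MB
```

The last line suggests why a non-privileged user on a larger system may need to request a smaller per-CPU trace buffer than the 128 KB default when the session limit is 2 MB.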

Profiling ProbeVue Session

The ProbeVue framework provides a profiling facility that can be turned on or off to estimate the impact of enabled probes on the application. This facility accumulates the time taken by probe actions when they are started and reports when requested or when the session ends.

The profiling report displays each probe string and the time taken by the action corresponding to that probe string. For each probe action, the report lists the total, minimum, maximum, and average time taken, together with the number of times that the probe action was timed. When one probe string covers multiple functions (through a regular expression or * in place of the function name), the profiling data is accumulated across the probes started for all such functions. Timing details are reported per probe action, not separately for each probed function.
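
To make the accumulated statistics concrete, the following Python sketch models how total, minimum, maximum, average, and count could be kept per probe string. It is a conceptual illustration under that assumption, not the ProbeVue implementation or its report format; the class names, the probe string, and the timing values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ActionTiming:
    """Running statistics for one probe action."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = 0.0

    def record(self, elapsed: float) -> None:
        self.count += 1
        self.total += elapsed
        self.minimum = min(self.minimum, elapsed)
        self.maximum = max(self.maximum, elapsed)

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


class SessionProfile:
    """Accumulates timings keyed by probe string, not by probed function."""

    def __init__(self) -> None:
        self._by_probe = {}

    def record(self, probe_string: str, elapsed: float) -> None:
        self._by_probe.setdefault(probe_string, ActionTiming()).record(elapsed)

    def report(self) -> dict:
        return {probe: (t.count, t.total, t.minimum, t.maximum, t.average)
                for probe, t in self._by_probe.items()}


profile = SessionProfile()
# One wildcard probe string may fire for several functions; all firings
# land in the same entry, which is why per-function detail is not available.
for elapsed in (2.0, 4.0, 6.0):
    profile.record("uft:1234:*:*:entry", elapsed)
print(profile.report())
```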

The BEGIN and END probe actions are not profiled with this facility. Profiling details are specific to a session. You can enable ProbeVue session profiling at session start by using the probevue command, or by using the probevctrl command.

For more information, see the probevue and probevctrl commands.

Sample programs

Example 1

The following canonical "Hello World" program prints "Hello World" into the trace buffer and exits:

#!/usr/bin/probevue
	
	/* Hello World in probevue */
	/* Program name: hello.e */
	
	@@BEGIN
	{
		printf("Hello World\n");
		exit();
	}

Example 2

The following "Hello World" program prints "Hello World" when you type Ctrl-C on the keyboard:

#!/usr/bin/probevue
	
	/* Hello World 2 in probevue */
	/* Program name: hello2.e */
	
	@@END
	{
		printf("Hello World\n");
	}

Example 3

The following program shows how to use thread-local variables. This Vue script counts the number of bytes written to a particular file. It assumes that the processes are single-threaded or those threads that open files are the same ones that write to them. It also assumes that all write operations are successful. The script can be terminated at any time and you can obtain the current count of bytes written by typing Ctrl-C on the terminal.

#!/usr/bin/probevue
	
	/* Program name: countbytes.e */
	int open( char * Path, int OFlag, int mode );
	int write( int fd, char * buf, int sz);
	int done;

	/* done starts at 0, so this probe runs only until /tmp/foo is first opened */
	@@syscall:*:open:entry
		when (done == 0)
	{
		if (get_userstring(__arg1, -1) == "/tmp/foo") {
			thread:trace = 1;	
			done = 1;
		}
	}
	
	@@syscall:*:open:exit
		when (thread:trace)
	{
		thread:fd = __rv;
	}
	
	@@syscall:*:write:entry
		when (thread:trace && __arg1 == thread:fd)
	{
		bytes += __arg3;	/* number of bytes is third arg */ 
	}
	
	@@END
	{
		printf("Bytes written = %d\n", bytes);	
	}

Example 4

The following tentative tracing program shows how to trace the arguments passed to the read system call only if the call fails (returns a negative value) when reading the foo.data file:

#!/usr/bin/probevue
	/* File: ttrace.e */
	/* Example of tentative tracing */
	/* Capture parameters to read system call only if read fails */
	int open ( char* Path, int OFlag , int mode );
	int read ( int fd, char * buf, int sz);
	
	@@syscall:*:open:entry
	{
		filename = get_userstring(__arg1, -1);
		if (filename == "foo.data") {
			thread:open = 1;
			start_tentative("read");
			printf("File foo.data opened\n");
		}
	}
	
	@@syscall:*:open:exit
		when (thread:open == 1)
	{
		  thread:fd = __rv;
		  start_tentative("read");
		  printf("fd = %d\n", thread:fd);
		  thread:open = 0;
	}
		 
	@@syscall:*:read:entry
		when (__arg1 == thread:fd)
	{
		start_tentative("read");
		printf("Read fd = %d, input buffer = 0x%08x, bytes = %d,",
			__arg1, __arg2, __arg3);
		end_tentative("read");
		thread:read = 1;
	}
	
	@@syscall:*:read:exit
		when (thread:read == 1)
	{
		if (__rv < 0) {
			/* The printf below, even though non-tentative, is only
			 * executed in error cases and merges with the 
			 * previously printed tentative data
			 */
			printf(" errno = %d\n", __errno);
			commit_tentative("read");
		}
		else
			discard_tentative("read");
		thread:read = 0;
	}

A possible output if the read failed because a bad address (say 0x1000) was passed as input buffer pointer could look like the following output:

# probevue ttrace.e
File foo.data opened
fd = 4
Read fd = 4, input buffer = 0x00001000, bytes = 256, errno = 14

Example 5

The following Vue script prints the values of some kernel variables and exits immediately. Pay attention to the exit function in the @@BEGIN probe:

/* File: kernel.e */
/* Example of accessing kernel variables */
/* System configuration structure from /usr/include/sys/systemcfg.h */
struct system_configuration {
	int architecture;	/* processor architecture */
	int implementation;	/* processor implementation */
	int version;		/* processor version */
	int width;		/* width (32 || 64) */
	int ncpus;		/* 1 = UP, n = n-way MP */
	int cache_attrib;	/* L1 cache attributes (bit flags)	*/
				/* bit		0/1 meaning		*/
				/* -------------------------------------*/
				/* 31	 no cache / cache present	*/
				/* 30	 separate I and D / combined    */
	int icache_size;	/* size of L1 instruction cache */
	int dcache_size;	/* size of L1 data cache */
	int icache_asc;		/* L1 instruction cache associativity */
	int dcache_asc;		/* L1 data cache associativity */
	int icache_block;	/* L1 instruction cache block size */
	int dcache_block;	/* L1 data cache block size */
	int icache_line;	/* L1 instruction cache line size */
	int dcache_line;	/* L1 data cache line size */
	int L2_cache_size;	/* size of L2 cache, 0 = No L2 cache */
	int L2_cache_asc;	/* L2 cache associativity */
	int tlb_attrib;		/* TLB attributes (bit flags)		*/
				/* bit		0/1 meaning		*/
				/* -------------------------------------*/
				/* 31	 no TLB / TLB present		*/
				/* 30	 separate I and D / combined    */
	int itlb_size;		/* entries in instruction TLB */
	int dtlb_size;		/* entries in data TLB */
	int itlb_asc;		/* instruction tlb associativity */
	int dtlb_asc;		/* data tlb associativity */
	int resv_size;		/* size of reservation */
	int priv_lck_cnt;	/* spin lock count in supervisor mode */
	int prob_lck_cnt;	/* spin lock count in problem state */
	int rtc_type;		/* RTC type */
	int virt_alias;		/* 1 if hardware aliasing is supported */
	int cach_cong;		/* number of page bits for cache synonym */
	int model_arch;		/* used by system for model determination */
	int model_impl;		/* used by system for model determination */
	int Xint;		/* used by system for time base conversion */
	int Xfrac;		/* used by system for time base conversion */
	int kernel;		/* kernel attributes			    */
				/* bit		0/1 meaning		    */
				/* -----------------------------------------*/
				/* 31	32-bit kernel / 64-bit kernel	    */
                                /* 30   non-LPAR      / LPAR                */
                                /* 29   old 64bit ABI / 64bit Large ABI     */
                                /* 28   non-NUMA      / NUMA                */
                                /* 27   UP            / MP                  */
                                /* 26   no DR CPU add / DR CPU add support  */
                                /* 25   no DR CPU rm  / DR CPU rm  support  */
                                /* 24   no DR MEM add / DR MEM add support  */
                                /* 23   no DR MEM rm  / DR MEM rm  support  */
                                /* 22   kernel keys disabled / enabled	    */
                                /* 21   no recovery   / recovery enabled    */
                                /* 20   non-MLS    / MLS enabled	    */
	long long physmem;	/* bytes of OS available memory		    */
	int slb_attr;		/* SLB attributes			    */
				/* bit		0/1 meaning		    */
				/* -----------------------------------------*/
				/* 31		Software Managed	    */
	int slb_size;		/* size of slb (0 = no slb)		    */
	int original_ncpus;	/* original number of CPUs		    */
	int max_ncpus;		/* max cpus supported by this AIX image     */
	long long maxrealaddr;	/* max supported real memory address +1     */
	long long original_entitled_capacity;
				/* configured entitled processor capacity   */
				/* at boot required by cross-partition LPAR */
				/* tools.				    */
	long long entitled_capacity; /* entitled processor capacity	    */
	long long dispatch_wheel; /* Dispatch wheel time period (TB units)  */
	int capacity_increment;	/* delta by which capacity can change	    */
	int variable_capacity_weight;	/* priority weight for idle capacity*/
					/* distribution			    */
	int splpar_status;	/* State of SPLPAR enablement		    */
				/*	0x1 => 1=SPLPAR capable; 0=not	    */
				/*	0x2 => SPLPAR enabled 0=dedicated;   */
				/*			     1=shared       */
	int smt_status;		/* State of SMT enablement                  */
				/*    0x1 = SMT Capable  0=no/1=yes         */
				/*    0x2 = SMT Enabled  0=no/1=yes         */
				/*    0x4 = SMT threads bound true 0=no/1=yes */
	int smt_threads;	/* Number of SMT Threads per Physical CPU   */
        int vmx_version;        /* RPA defined VMX version, 0=none/disabled */
	long long sys_lmbsize;	/* Size of an LMB on this system. */
	int num_xcpus;		/* Number of exclusive cpus on line */
	signed char errchecklevel;/* Kernel error checking level */
	char pad[3];		/* pad to word boundary		*/
        int dfp_version;        /* RPA defined DFP version, 0=none/disabled */
				/* if MSbit is set, DFP is emulated         */
};

__kernel struct system_configuration _system_configuration;

@@BEGIN
{
	String s[40];
	int j;
	__kernel int max_sdl;	/* Atomic RAD system decomposition level */
	__kernel long lbolt;	/* Ticks since boot  */

	printf("No. of online CPUs\t\t= %d\n", _system_configuration.ncpus);

	/* Print SMT status */
	printf("SMT status\t\t\t=");
	if (_system_configuration.smt_status == 0)
		printf(" None");
	else {
		if (_system_configuration.smt_status & 0x01)
			printf(" Capable");
		if (_system_configuration.smt_status & 0x02)
			printf(" Enabled");
		if (_system_configuration.smt_status & 0x04)
			printf(" BoundThreads");
	}
	printf("\n");

	/* Print error checking level */
	if (_system_configuration.errchecklevel == 1)
		s = "Minimal";
	else if (_system_configuration.errchecklevel == 3)
		s = "Normal";
	else if (_system_configuration.errchecklevel == 7)
		s = "Detail";
	else if (_system_configuration.errchecklevel == 9)
		s = "Maximal";
	printf("Error checking level\t\t= %s\n",s);

	printf("Atomic RAD system detail level\t= %d\n", max_sdl);

	/* Long in the kernel is 64-bit, so we use %lld below */
	printf("Number of ticks since boot\t= %lld\n", lbolt);

	exit();
}

The following is a possible output when you run the preceding script on a Power 5 dedicated partition with default kernel attributes:

# probevue kernel.e
No. of online CPUs              = 4
SMT status                      = Capable Enabled BoundThreads
Error checking level            = Normal
Atomic RAD system detail level  = 2
Number of ticks since boot      = 34855934

Probe managers

Probe managers are not part of the basic ProbeVue framework but are nevertheless an essential component of dynamic tracing. Probe managers are the providers of the probe points that can be instrumented by ProbeVue.

Probe managers generally support a set of probe points that belong to some common domain and share some common feature or attribute that distinguishes them from other probe points. Probe points are useful at points where control flow changes significantly, at points of state change or at other points of significant interest. Probe managers are careful to select probe points only in locations that are safe to instrument.

Probe managers can choose to define their own distinct rules for the probe specifications within the common style that must be followed for all probe specifications.

ProbeVue supports the following probe managers:

  • System call probe manager
  • User function probe manager
  • Interval probe manager
  • System trace probe manager
  • Extended System Call probe manager
  • I/O probe manager
  • Network probe manager
  • Sysproc probe manager

System call probe manager

The syscall probe manager supports probes at the entry and exit of well-defined and documented base AIX system calls. These are the system calls that have the same interface at the libc.a (or C library) entry point and at the kernel entry point. Either the system call is a pass-through (the C library simply imports the symbol from the kernel and then exports it with no code in the library) or there is trivial code for the interface inside the library.

The syscall probe manager accepts a 4-tuple probe specification in one of the following formats:

  • syscall:*:<system_call_name>:entry
  • syscall:*:<system_call_name>:exit
where the system_call_name field is to be substituted by the actual system call name. These formats indicate that a probe is to be placed at the entry or exit of the system call. Setting the second field to * indicates that the probe fires for all processes.
Note: Different privileges are required for enabling system call probes. Probing every process in the system requires higher privileges than probing your own processes.

Additionally, the syscall probe manager also accepts a 4-tuple probe specification in one of the following formats:

  • syscall:<process_ID>:<system_call_name>:entry
  • syscall:<process_ID>:<system_call_name>:exit

where a process ID can be specified as the second field of the probe specification to support probing of specific processes.
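
The two forms of the syscall probe specification can be illustrated with a small parser. The following Python sketch is an assumption-laden model for explanation only: `SyscallProbe`, `parse_syscall_probe`, and `narrowest_authorization` are hypothetical names, the second field is expected to be already resolved to `*` or a numeric process ID, and the authorization mapping simply restates the authorization descriptions given earlier.

```python
from typing import NamedTuple, Optional


class SyscallProbe(NamedTuple):
    pid: Optional[int]   # None stands for "*" (all processes)
    name: str            # libc.a interface name, for example "read"
    location: str        # "entry" or "exit"


def parse_syscall_probe(spec: str) -> SyscallProbe:
    """Parse 'syscall:<*|pid>:<name>:<entry|exit>', with optional '@@' prefix."""
    text = spec[2:] if spec.startswith("@@") else spec
    fields = text.split(":")
    if len(fields) != 4 or fields[0] != "syscall":
        raise ValueError("expected a 4-tuple starting with 'syscall'")
    _, process, name, location = fields
    if process == "*":
        pid = None
    elif process.isascii() and process.isdigit():
        pid = int(process)
    else:
        raise ValueError("second field must be '*' or a process ID")
    if not name:
        raise ValueError("system call name is missing")
    if location not in ("entry", "exit"):
        raise ValueError("fourth field must be 'entry' or 'exit'")
    return SyscallProbe(pid, name, location)


def narrowest_authorization(probe: SyscallProbe) -> str:
    """Narrowest documented authorization that could cover the probe.

    For a specific process ID, the '.self' authorization is enough only
    when that process was started by the invoking user.
    """
    if probe.pid is None:
        return "aix.ras.probevue.trace.syscall"
    return "aix.ras.probevue.trace.syscall.self"
```

For instance, `parse_syscall_probe("@@syscall:*:read:entry")` yields a probe with `pid=None`, for which the system-wide syscall authorization is the narrowest candidate.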

The system call names accepted by the syscall probe manager are the names of the libc.a interfaces and not the kernel's internal system call names. For example, the read subroutine is exported by libc.a, but the actual system call name or kernel entry point is kread. The syscall probe manager internally translates a libc interface to its kernel entry point and enables the probe at entry into the kread kernel routine. Because of this, if multiple C library interfaces invoke the kread routine, the probe point also fires for those interfaces. Generally, this is not a problem because for most of the system calls supported by the syscall probe manager, there is a 1-to-1 mapping between the libc interface and the kernel routine.

For each syscall probe, there is an equivalent probe point in the library code provided by the uft probe manager. The uft probe manager does support all library interfaces (unless it is a passthrough interface and there is no code for the call or references to it in the library at all) including those not supported by the syscall probe manager. However, the syscall probe manager has two advantages:

  • The syscall probe manager can probe every process in the system when an asterisk (*) is specified as the second field.
  • The syscall probe manager is more efficient than the uft probe manager because it does not need to switch from user mode to kernel mode and back to run the probe actions.

For more information about the full list of system calls supported by the syscall probe manager, see ProbeVue.

UFT probe manager

The uft or the user function tracing probe manager supports probing user space functions that are visible in the XCOFF symbol table of a process. The uft probe manager supports probe points that are at entry and exit points of functions whose source is a C or FORTRAN language text file even though the symbol table can contain symbols whose sources are from a language other than C or FORTRAN.

From the user's point of view, Java™ applications are traced in a way that is identical to the existing tracing mechanism; the JVM performs most of the real tasks on behalf of ProbeVue.

For probing Java applications, see "Java Applications Probe Manager" below.

The uft probe manager accepts a 5-tuple probe specification in the following format:

uft:<processID>:*:<function_name>:<entry|exit>
Note: The uft probe manager requires the process ID for the process to be traced and the complete function name of the function at whose entry or exit point the probe is to be placed.

When the third field is set to *, the uft probe manager searches for the function in all of the modules loaded into the process address space, including the main executable and shared modules. This implies that if a program contains more than one C function with this name (for example, functions with static class that are contained in different object modules), then probes are applied to the entry point of every one of these functions.

If a function name in a specific module needs to be probed, the module name needs to be specified in the third field. The probe specification syntax to provide the library module name is illustrated below:

# Function foo in any module
@@uft:<pid>:*:foo:entry 
# Function foo in any module in any archive named libc.a
@@uft:<pid>:libc.a:foo:entry 
# Function foo in the shr.o module in any archive named libc.a
@@uft:<pid>:libc.a(shr.o):foo:entry	  

The function name in the fourth tuple can be specified as an Extended Regular Expression (ERE). The ERE should be enclosed between "/ and /" like "/<ERE>/".

When the function name is specified as an ERE, all the functions matching the specified regular expression in the specified module are probed.

/* Probe entry of all libc.a functions starting with the word "malloc" */
@@uft:$__CPID:libc.a:"/^malloc.*/":entry
/* Probe exit of all functions in the executable a.out */
@@uft:$__CPID:a.out:"/.*/":exit

In entry probes where the function name is specified as a regular expression, individual arguments cannot be accessed. However, the ProbeVue function print_args can be used to print the function name and its arguments. The argument values are printed based on the argument type information available in the traceback table of the function.

In exit probes where the function name is specified as a regular expression, the return value cannot be accessed.
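
Before enabling a regular-expression probe, it can help to check offline which function names a pattern would select. The following Python sketch is only an illustration and does not call ProbeVue. The symbol names are hypothetical, Python's re module stands in for POSIX ERE matching (the two agree for simple patterns such as these), and it is assumed that an unanchored pattern may match anywhere in the name.

```python
import re

def strip_ere_delimiters(field):
    """Return the pattern inside a "/<ERE>/" function-name field.

    Raises ValueError if the field is not written in the "/.../" form.
    """
    if len(field) < 4 or not (field.startswith('"/') and field.endswith('/"')):
        raise ValueError('ERE must be written as "/<ERE>/"')
    return field[2:-2]

def matching_functions(field, symbols):
    """List the symbols that an ERE function-name field would select.

    Assumption: an unanchored pattern matches anywhere in the name.
    """
    pattern = re.compile(strip_ere_delimiters(field))
    return [name for name in symbols if pattern.search(name)]

# Hypothetical symbol names, for illustration only.
symbols = ["malloc", "malloc_y", "free", "calloc", "foo"]

print(matching_functions('"/^malloc.*/"', symbols))
print(matching_functions('"/.*/"', symbols))
```

A pattern such as "/.*/" selects every function in the module, so the number of enabled probes can grow quickly; narrowing the pattern or the module field keeps the session smaller.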

ProbeVue supports enabling probes in more than one process at the same time. However, you need privileges even to probe processes that belong to you.

ProbeVue enforces a restriction that prevents processes with user-space probes from being debugged using the ptrace or procfs based APIs.

As indicated above, the uft probe manager supports probes in shared modules like shared library modules. The following script shows an example that traces mutex activity by enabling probes in the thread library's mutex lock and unlock subroutines.

/* pthreadlocks.e */
/* Trace pthread mutex activity for a given multithreaded process */
/* The following defines are from /usr/include/sys/types.h */

typedef long long pid_t;
typedef long long thread_t;

typedef struct {
	int	__pt_mutexattr_status;	
	int	__pt_mutexattr_pshared;	
	int	__pt_mutexattr_type;
} pthread_mutexattr_t;

typedef struct __thrq_elt thrq_elt_t;

struct __thrq_elt {
	thrq_elt_t	*__thrq_next;
	thrq_elt_t	*__thrq_prev;
};

typedef volatile unsigned char _simplelock_t;

typedef struct __lwp_mutex {
	char		__wanted;
	_simplelock_t	__lock;
} lwp_mutex_t;

typedef struct {
	lwp_mutex_t		__m_lmutex;
	lwp_mutex_t		__m_sync_lock;
	int			__m_type;
	thrq_elt_t		__m_sleepq;
	int			__filler[2];
} mutex_t;

typedef struct {
	mutex_t			__pt_mutex_mutex;
	pid_t			__pt_mutex_pid;
	thread_t		__pt_mutex_owner;
	int			__pt_mutex_depth;
	pthread_mutexattr_t	__pt_mutex_attr;
} pthread_mutex_t;

int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);

@@uft:$__CPID:*:pthread_mutex_lock:entry
{
	printf("thread %d: mutex 0x%08x locked\n", __tid, __arg1);
}

@@uft:$__CPID:*:pthread_mutex_unlock:entry
{
	printf("thread %d: mutex 0x%08x unlocked\n", __tid, __arg1);
}

The probe specification, argument access, and ProbeVue function usage in probe actions for Fortran function probes are similar to other uft probes, with the following differences:

  • The user must map the Fortran data types to ProbeVue data types and use them in the script. The mapping of Fortran basic data types to ProbeVue data types is listed in the following table.
    Table 3. Fortran to ProbeVue data types mapping
    Fortran data type    ProbeVue data type
    INTEGER * 2          short
    INTEGER * 4          int/long
    INTEGER * 8          long long
    REAL                 float
    DOUBLE PRECISION     double
    COMPLEX              No equivalent basic data type. This needs to be mapped to a structure as shown below:
                         typedef struct complex {
                                 float a;
                                 float b;
                         } COMPLEX;
    LOGICAL              int (The Fortran standard requires logical variables to be the same size as INTEGER/REAL variables)
    CHARACTER            char
    BYTE                 signed char
  • Fortran passes IN scalar arguments of internal procedures by value, and other arguments by reference. Arguments passed by reference should be accessed with copy_userdata(). More information on argument association in Fortran can be found in the Argument association topic.
  • Routine names in a Fortran program are case-insensitive. However, when they are specified in a ProbeVue script, they should be in lowercase.
    The following sample script illustrates how to map Fortran data types to ProbeVue data types:
    /* cmp_calc.e */
    /* Trace fortran routines 
    cmp_calc(COMPLEX, INTEGER) and 
    cmplxd(void) */
    
    typedef struct complex{
            float a;
            float b;
            } COMPLEX;
    
    typedef int INTEGER;
    
    /* arguments are indicated to be of pointer type as they are passed by reference */
    void cmp_calc(COMPLEX *, INTEGER *);  
    void cmplxd();
    
    @@uft:$__CPID:*:cmplxd:entry
    {
    printf("In cmplxd entry \n");
    }
    
    @@uft:$__CPID:*:cmp_calc:entry
    {
    COMPLEX c;
    int i;
    copy_userdata(__arg1, c);
    copy_userdata(__arg2, i);
    printf("%10.7f+j%9.7f  %d \n", c.a,c.b,i);
    }
  • Fortran stores arrays in column-major form, whereas ProbeVue stores them in row-major form. The following script shows how to retrieve the array elements.
    /* array.e*/
    /* ProbeVue script to probe fortran program array.f */
    
    void displayarray(int **, int, int);
    @@uft:$__CPID:*:displayarray:entry
    {
    int a[5][4];		/* row and column sizes are interchanged */
    copy_userdata(__arg1, a);
    /* to print the first row */
    printf("%d %d %d \n", a[0][0], a[1][0], a[2][0]);
    /* to print the second row */
    printf("%d %d %d\n", a[0][1], a[1][1], a[2][1]);
    }
    
    /* Fortran program array.f */
    
    PROGRAM ARRAY_PGM
    IMPLICIT NONE 
    INTEGER, DIMENSION(1:4,1:5) :: Array 
    INTEGER :: RowSize, ColumnSize
    CALL ReadArray(Array, RowSize, ColumnSize) 
    CALL DisplayArray(Array, RowSize, ColumnSize) 
    CONTAINS
    SUBROUTINE ReadArray(Array, Rows, Columns)
    IMPLICIT NONE
    INTEGER, DIMENSION(1:,1:), INTENT(OUT) :: Array
    INTEGER, INTENT(OUT) :: Rows, Columns
    INTEGER :: i, j
    READ(*,*) Rows, Columns
    DO i = 1, Rows
    READ(*,*) (Array(i,j), j=1, Columns)
    END DO
    END SUBROUTINE ReadArray
    SUBROUTINE DisplayArray(Array, Rows, Columns) 
    IMPLICIT NONE 
    INTEGER, DIMENSION(1:,1:), INTENT(IN) :: Array
    INTEGER, INTENT(IN) :: Rows, Columns 
    INTEGER :: i, j 
    DO i = 1, Rows 
    WRITE(*,*) (Array(i,j), j=1, Columns )
    END DO 
    END SUBROUTINE DisplayArray
    END PROGRAM ARRAY_PGM
  • Intrinsic or built-in functions cannot be probed with ProbeVue. All Fortran routines listed in the XCOFF symbol table of the executable or linked libraries can be probed. ProbeVue uses the XCOFF symbol table to identify the location of these routines. However, the user must provide the prototype for the routine, and ProbeVue tries to access the arguments according to the prototype provided. For routines whose names are mangled by the compiler, the mangled name must be provided. Because Vue is a C-style language, the user should ensure that the Fortran function or subroutine prototype is appropriately mapped to a C language style function prototype. Refer to the linkage conventions for argument passing and function return values in the Passing data from one language to another topic. The following example illustrates this:
    /* Fortran program ext_op.f */
    /* Operator “*” is extended for rational multiplication */
    MODULE rational_arithmetic
    IMPLICIT NONE
            TYPE RATNUM
                    INTEGER :: num, den
            END TYPE RATNUM
            INTERFACE OPERATOR (*)
                    MODULE PROCEDURE rat_rat, int_rat, rat_int
            END INTERFACE
            CONTAINS
            FUNCTION rat_rat(l,r)      ! rat * rat
                    TYPE(RATNUM), INTENT(IN) :: l,r
                    TYPE(RATNUM) :: val,rat_rat
                    val.num=l.num*r.num
                    val.den=l.den*r.den
                    rat_rat=val
            END FUNCTION rat_rat
            FUNCTION int_rat(l,r)      ! int * rat
                    INTEGER, INTENT(IN)      :: l
                    TYPE(RATNUM), INTENT(IN) :: r
                    TYPE(RATNUM) :: val,int_rat
                    val.num=l*r.num
                    val.den=r.den
                    int_rat=val
            END FUNCTION int_rat
            FUNCTION rat_int(l,r)      ! rat * int
                    TYPE(RATNUM), INTENT(IN) :: l
                    INTEGER, INTENT(IN)      :: r
                    TYPE(RATNUM) :: val,rat_int
                    val.num=l.num*r
                    val.den=l.den
                    rat_int=val
            END FUNCTION rat_int
    END MODULE rational_arithmetic
    PROGRAM Main1
    Use rational_arithmetic
    IMPLICIT NONE
    	TYPE(RATNUM) :: l,r,l1
    	l.num=10
            l.den=11
            r.num=3
            r.den=4
            L1=l*r
    END PROGRAM Main1
    
    /* ext_op.e */
    /* ProbeVue script to probe routine that gets called when “*”
        is used to multiply rational numbers in ext_op.f */
    
    struct rat
    {
            int num;
            int den;
    };
    struct rat rat;
    void __rational_arithmetic_NMOD_rat_rat(struct rat*,
    	struct rat*,struct rat*); 
    /* Note that the mangled function name is provided. */    
    /* Also, the structure to be returned is sent in the buffer whose address is provided as the first argument. */
    /* The first explicit parameter is in the second argument. */
    @@BEGIN
    {
            struct rat* rat3;
    }
    @@uft:$__CPID:*:__rational_arithmetic_NMOD_rat_rat:entry
    {
    	struct rat rat1,rat2;
            copy_userdata((struct rat *)__arg2,rat1);
            copy_userdata((struct rat *)__arg3,rat2);
            rat3=__arg1;
    	/* The address of the buffer where the returned structure will be stored is saved at the function entry */
            printf("Argument Passed rat_rat = %d:%d,%d:%d\n",rat1.num,rat1.den,rat2.num,rat2.den);
    }
    @@uft:$__CPID:*:__rational_arithmetic_NMOD_rat_rat:exit
    {
            struct rat rrat;
            copy_userdata((struct rat *)rat3,rrat);	
            /* The saved buffer address is used to fetch the returned structure */
            printf("Return from rat_rat = %d:%d\n",rrat.num,rrat.den);
            exit();
    }
  • ProbeVue does not support direct inclusion of Fortran header files in the script. However, a mapping of Fortran data types to ProbeVue data types can be provided in a ProbeVue header file and included with the -I option.
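
The index swap used in the array.e script above (the Fortran 4x5 array declared as a[5][4], with Array(i,j) read as a[j-1][i-1]) follows from the two storage orders. The Python sketch below only models the memory layout to make that mapping visible; it does not use ProbeVue, and the array contents are hypothetical.

```python
def column_major_flatten(rows, cols, value_at):
    """Lay out a rows x cols array the way Fortran does:
    the first (row) index varies fastest in memory."""
    return [value_at(i, j) for j in range(1, cols + 1) for i in range(1, rows + 1)]

def row_major_view(flat, dim0, dim1):
    """Reinterpret the same memory as a C-style a[dim0][dim1] array."""
    return [flat[k * dim1:(k + 1) * dim1] for k in range(dim0)]

def fortran_to_vue_index(i, j):
    """Map 1-based Fortran Array(i, j) to 0-based indices for a[j-1][i-1]."""
    return (j - 1, i - 1)

def value_at(i, j):
    """Hypothetical contents: Array(i, j) = 10*i + j."""
    return 10 * i + j

ROWS, COLS = 4, 5                       # INTEGER, DIMENSION(1:4,1:5) :: Array
memory = column_major_flatten(ROWS, COLS, value_at)
a = row_major_view(memory, COLS, ROWS)  # int a[5][4], sizes interchanged

# First Fortran row: Array(1,1), Array(1,2), Array(1,3)
first_row = [a[0][0], a[1][0], a[2][0]]
# Second Fortran row: Array(2,1), Array(2,2), Array(2,3)
second_row = [a[0][1], a[1][1], a[2][1]]
print(first_row)
print(second_row)
```

With these contents, the tens digit of each value is the Fortran row and the units digit is the Fortran column, so the printed lists show directly which elements the swapped indices pick up.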

C++ applications probe manager

The C++ probe manager supports probing of C++ applications in a way identical to the C probe managers. It supports "uft" style entry/exit probes on any C++ function, including member, overloaded, operator, and template functions, in the core executable. A function entry/exit probe in C++ must use the @@uftxlc++ probe manager.

All tuples in the @@uftxlc++ style probe specifications have the same usage and format as for the @@uft style probe strings, with the exception of the function name. Because C++ allows a single function name to be overloaded, the function name specified in the probe string may have to include the function's argument types to uniquely identify the function being probed.

For example:

@@uftxlc++:12345:*:"foobar(int, char *)":entry
Note: The return type is missing from the above probe string because it does not take part in the name mangling algorithm for regular functions. For a template function, the user must specify an explicit template instantiation to probe and must also specify the return type of the template instantiation:
@@uftxlc++:12345:*:"void foobar<int>(int, char *)":entry
Note: The probe strings must use quotes around the function name, as in the two examples above; the probevue command signals an error if the quotes are missing. The quotes are needed not only because of the colon (:) but also because of the comma (,). The comma operator separates multiple probes on the same line, and without the quotes it takes precedence, which results in confusing error messages.

When probing a class member function or a function defined in a namespace, the fully qualified function name must be used in the probe string. To avoid any ambiguity between the single colon (:) tuple separator in probe strings and the double colon (::) scope resolution operator in a fully qualified C++ name, the entire function name tuple in the probe string must be quoted.

@@uftxlc++:12345:*:"Foo::bar(int)":entry
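
The need for quoting can be seen by splitting a probe string on its colon separators. The Python sketch below is a simplified illustration and is not how the probevue command parses probe strings; the function name split_probe_tuples is hypothetical. It splits on colons that fall outside double quotes and shows that an unquoted qualified name no longer yields five fields.

```python
def split_probe_tuples(spec):
    """Split a probe string on ':' separators that are outside double quotes."""
    fields, current, in_quotes = [], [], False
    for ch in spec:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ':' and not in_quotes:
            fields.append(''.join(current))
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quote in probe string")
    fields.append(''.join(current))
    return fields

quoted = split_probe_tuples('@@uftxlc++:12345:*:"Foo::bar(int)":entry')
unquoted = split_probe_tuples('@@uftxlc++:12345:*:Foo::bar(int):entry')
print(len(quoted), quoted)
print(len(unquoted), unquoted)
```

The quoted form keeps the function name as a single fourth field, while the unquoted form breaks the name apart at the scope resolution operator.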
Limitations:
  1. Access to data fields that are inherited from a virtual base class is not supported.
  2. Template classes are not supported and must not be included in the C++ header.
  3. Pointers to members are not supported.
  4. To probe a class with the class definition, an object of the class is instantiated in the header file either as a global object or in a dummy function.

Example:

The following is a C++ application, app.c++:

#include  "header.cc"
main()
{
int             i = 10;
incr_num(i);
float           a = 3.14;
incr_num(a);
char            ch = 'A';
incr_num(ch);
double          d = 1.11;
incr_num(d);
}

Content of the "header.cc"

# cat header.cc
#include <iostream.h>
template <class T>
T incr_num( T a)
{
return (++a);
}
int dummy()
{
int  i=10,j=20;
incr_num(i);
float a=3.14;
incr_num(a);
char  ch ='A',dh='Z';
incr_num(ch);
double d=1.1,e=1.11;
incr_num(d);
return  0;
}

Content of the Vue script vue_cpp.e

##C++
#include "header.cc"
##Vue
@@uftxlc++:$__CPID:*:"incr_num<int>(int)":entry
{
printf("Hello1_%d\n",__arg1 );
}
@@uftxlc++:$__CPID:*:"incr_num <  float   > (float)" :entry
{
printf("Hello2_%f\n",__arg1 );
}
@@uftxlc++:$__CPID:*:"incr_num <  char    > ( char )":entry
{
printf("Hello3_%c\n",__arg1 );
}
@@uftxlc++:$__CPID:*:"incr_num <  double    > ( double )":entry
{
printf("Hello4_%lf\n",__arg1 );
exit();
}

Execution :

/usr/vacpp/bin/xlC  app.c++
#  probevue  -X ./a.out  vue_cpp.e
Hello1_10
Hello2_3.140000
Hello3_A
Hello4_1.110000

The function prototype in the fourth tuple can be specified as an Extended Regular Expression (ERE). The ERE should be enclosed between "/ and /" like "/<ERE>/". When the function prototype is specified as an ERE, all the functions matching the specified regular expression in the specified module are probed.

/* Probe entry of all the C++ functions in the executable a.out */
@@uftxlc++:$__CPID:a.out:"/.*/":entry
/* Probe exit of all the C++ functions with the word 'foo' in their names */
@@uftxlc++:$__CPID:*:"/foo/":exit

In entry probes where the function name is specified as a regular expression, individual arguments cannot be accessed. However, the ProbeVue function print_args() can be used to print the function name and its arguments. The argument values are printed based on the argument type information available in the traceback table of the function.

In exit probes where the function name is specified as a regular expression, the return value cannot be accessed.

Java applications probe manager

The Java probe manager (JPM) supports probing of Java applications in a way identical to the C and C++ probe managers. A single Vue script can trace multiple Java applications at the same time by using the process IDs of their JVMs. The same script can also use other probe managers to probe system calls or C/C++ applications along with Java applications.

Like the uft (user function tracing) probe manager, the Java probe manager accepts a 5-tuple probe specification in the following format:

uftjava:<process_ID>:*:<qualified_function_name>:entry

where:

  • The second field is the process ID of the JVM process corresponding to the Java application that is being traced.
  • The third field is reserved for future use.
  • The fourth field specifies the Java method to probe. This name is a completely qualified name as used in Java applications, like Mypackage.Myclass.Mymethod.

The following restrictions apply:

  • Only pure Java methods can be probed. Native code (shared library calls) and encrypted code cannot be traced.
  • Only entry probes are supported.
  • Only JVM version 1.5 and later with support for the JVMTI interface is supported.
  • At any given time, no two ProbeVue sessions can probe the same Java application with @@uftjava.
  • Polymorphic or overloaded methods are not supported.
  • Tracing or accessing external variables that have the same name as a ProbeVue keyword or built-in name is not supported. Such external symbols (Java application variable names) might need to be renamed.
  • Accessing arrays of Java applications is not supported in this release.
  • The get_function() built-in is not supported for the Java language in this release.
Note: When tracing non-static methods, argument numbering starts at __arg2, as for non-static methods in C++. The __arg1 variable holds the self reference (the this pointer).

Data access: The action blocks of Java probes can access the following data, similar to existing behavior.

  • Action blocks can access global, local, and kernel script variables.
  • Action blocks can access method arguments (Entry class variables) of primitive types.
  • Action blocks can access the built-in variables.
  • Action blocks can access Java application variables through fully qualified names. Only static variables (class members) are supported.
    x = some_package.app.class.var_x;    //Access static/class member.
  • Accessing Java application variables of primitive types is supported. They are implicitly converted, promoted, or cast to equivalent types in the Vue language without losing value. However, the actual memory usage (size) may differ from that of the Java language.

The functions supported in the context of the Java probe manager are listed in the following table:

Table 4. Functions supported by the Java probe manager
Function                Description
stktrace()              Provides the stack trace of the Java application (running thread) that is being traced.
copy_userdata()         Copies data from the Java application into script variables.
get_probe()             Returns the probe string.
get_stktrace()          Returns the runtime stack trace.
get_location_point()    Returns the current probe location.
get_userstring()        Copies string data from the Java application.
exit()                  Exits from the ProbeVue trace session.

Changes to the probevue command:

Table 5. probevue command change
Option    Description
-X        This option can be used (along with the -A option) to launch a Java application. In the current release, the user must manually pass an additional string, agentlib:probevuejava, along with all the other options that are needed to run the Java application.

For example:

probevue -X /usr/java5/bin/java -A "-agentlib:probevuejava myjavaapp" myscript.e

When running the 64-bit JVM, use agentlib:probevuejava64, as in:

probevue -X /usr/java5_64/bin/java -A "-agentlib:probevuejava64 myjavaapp" myscript.e

where myjavaapp is the Java class of the myjavaapp.java application.

Example ExtendedClass.java Source:

class BaseClass
{
        static int i=10;

        public static void test(int x)
        {
                i += x;
        }
}

public class ExtendedClass extends BaseClass
{
        public static void test(int x, String msg)
        {
                i += x;
                System.out.print("Java: " + msg + "\n\n");
                BaseClass.test(x);
        }

        public static void main(String[] args)
        {
                BaseClass.test(5);
                ExtendedClass.test(10, "hello");
        }
}

Example test.e script for above Java application:

@@uftjava:$__CPID:*:"BaseClass.test":entry
{
        printf("BaseClass.i: %d\n", BaseClass.i);
        printf("BaseClass.test: %d\n", __arg1);
        stktrace(0, -1);
        printf("\n");
}

@@uftjava:$__CPID:*:"ExtendedClass.test":entry
{
        printf("BaseClass.i: %d\n", BaseClass.i);
        printf("ExtendedClass.test: %d, %s\n", __arg1, __arg2);
        stktrace(0, -1);
        printf("\n");
}

Example ProbeVue session with above script:

# probevue -X /usr/java5/jre/bin/java \
-A "-agentlib:probevuejava ExtendedClass" test.e
Java: hello

BaseClass.i: 10
BaseClass.test: 5
BaseClass.test()+0
ExtendedClass.main()+1

BaseClass.i: 15
ExtendedClass.test: 10, hello
ExtendedClass.test()+0
ExtendedClass.main()+8

BaseClass.i: 25
BaseClass.test: 10
BaseClass.test()+0
ExtendedClass.test()+39
ExtendedClass.main()+8

Interval probe manager

The interval probe manager provides probe points that fire at a user-defined time interval. The probe points are not located in kernel or application code; instead, they are probe events based on wall clock time intervals.

The interval probe manager is useful for summarizing statistics collected over an interval of time. It accepts a 4-tuple probe specification in the following format:

@@interval:*:clock:<# milliseconds>

The interval probe manager filters probe events by process ID if one is provided in the second field. Setting the second field to * indicates that the probe is fired for all processes. The only value supported for the third field is the clock keyword, which identifies the probe specification as a wall clock probe. The fourth and last field, <# milliseconds>, identifies the number of milliseconds between firings of the probe. The interval probe manager requires that the value of this field consist only of the digits 0-9. For interval probes without a process ID (non-profiling interval probes), the interval must be exactly divisible by 100. Thus, probe events that are 100 ms, 200 ms, 300 ms, and so on, apart are allowed. For interval probes with a process ID specified (profiling interval probes), the interval must be greater than or equal to the minimum interval allowed for the global root user, or exactly divisible by 10 for other users. Thus, probe events that are 10 ms, 20 ms, 30 ms, and so on, apart are allowed for normal users. Only one profiling interval probe can be active for a process.
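
The rules for the fourth field can be summarized as a small check. The Python sketch below is one reading of the rules stated above, not the validation that ProbeVue itself performs. The function name check_interval and its parameters are hypothetical, the minimum interval for the global root user is not stated in the text and is therefore a caller-supplied placeholder, and rejecting a zero interval is an assumption.

```python
def check_interval(field, pid_specified, is_root=False, root_minimum_ms=1):
    """Check the fourth field of an @@interval probe specification.

    root_minimum_ms is a placeholder: the text does not state the minimum
    interval allowed for the global root user, so the caller must supply it.
    """
    if not field or any(ch not in "0123456789" for ch in field):
        return False                  # only the digits 0-9 are accepted
    ms = int(field)
    if ms == 0:
        return False                  # assumption: a zero interval is rejected
    if not pid_specified:
        return ms % 100 == 0          # second field is *: multiples of 100 ms
    if is_root:
        return ms >= root_minimum_ms  # profiling probe, global root user
    return ms % 10 == 0               # profiling probe, other users

print(check_interval("200", pid_specified=False))
print(check_interval("150", pid_specified=False))
print(check_interval("30", pid_specified=True))
```

Under this reading, @@interval:*:clock:150 is rejected, while the same 150 ms interval is acceptable for a normal user once a process ID is placed in the second field.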

Note: The interval probe manager does not guarantee that a probe will be fired exactly the number of milliseconds apart as indicated by value of the fourth field. Higher-priority interrupts and code that runs after disabling all interrupts can cause the probe to fire later than the specification.

The interval probe manager requires only basic dynamic tracing privileges. The interval probe manager enforces the following limits on the number of probes it supports to prevent malicious users from running the kernel out of memory by creating huge numbers of interval probes.

Table 6. Limits specified by the interval probe manager
Limit                                          Count
Maximum number of interval probes per user     32
Maximum number of interval probes in system    1024

The interval probe manager does not support the following functions. If used inside an interval manager probe point, these functions will generate an empty string or zero as output.

  • get_function
  • get_probe
  • get_location_point

When process ID is not specified, an interval probe can trigger in the context of any process depending upon when the probe fires since the probe event is based on wall clock time. Because of this, the ProbeVue framework does not allow the use of any of the following functions inside the interval probe manager's action block to prevent unauthorized access to a process's internal data. This security violation is caught only in the kernel. The Vue script will successfully compile but the session will fail to initialize.

  • stktrace
  • get_userstring

These functions provide no value when used from the probe manager. Even if you are the root user, you cannot call these functions inside the interval probe manager.

When the process ID is specified, the interval probe is triggered for all the threads within the process at the specified time interval. Because the probe is fired in the context of the process, the stktrace() function and the __pname built-in are allowed inside the interval probe manager's action block, unlike when the process ID is not specified.

System trace probe manager

The system trace probe manager provides probe points wherever existing system trace hooks to trace channel zero (system event channel) occur, both within the kernel and within applications. To use this probe manager, you must have the kernel access privilege, and not be running in a WPAR.

The system trace probe manager accepts a 3-tuple probe specification in the following format:

@@systrace:*:<hookid>

where the hookid argument specifies the ID for the specific system trace hook of interest. The hookid argument consists of 4 hex digits typically of the form hhh0. For example, to specify the hookid argument for the fork system call, specify 1390. See the /usr/include/sys/trchkid.h file for examples, such as HKWD_SYSC_FORK. The entries in this file are hook words, where the hookid value is in the upper halfword. Because hook words can be arbitrary, no validation of the hookid argument beyond checking that it is a valid hex string of up to 4 hex digits is performed. It is not an error to specify a hookid value that never occurs.

As a convenience, you can specify the hookid argument with fewer than 4 hex digits. In this case, first a trailing zero is assumed, and then additional leading zeroes as necessary to implicitly define the required 4 digits. For example, you can use 139 as an abbreviation of 1390. Similarly, 0100, 010, and 10 all specify the same hookid value, taken from HKWD_USER1.
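
The abbreviation rule can be expressed in a few lines. The Python sketch below models the rule as described above (a trailing zero first, then leading zeroes) and is not taken from the ProbeVue implementation. The function names are hypothetical, and the hook word 0x13900000 is an illustrative value chosen to be consistent with the 1390 example, with the hookid in the upper halfword.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def normalize_hookid(text):
    """Expand an abbreviated hookid to the full 4 hex digits."""
    if not 1 <= len(text) <= 4 or any(ch not in HEX_DIGITS for ch in text):
        raise ValueError("hookid must be 1 to 4 hex digits")
    if len(text) < 4:
        text = text + "0"              # first, a trailing zero is assumed
    return text.rjust(4, "0").lower()  # then leading zeroes are added as needed

def hookid_from_hookword(hookword):
    """Extract the hookid from the upper halfword of a 32-bit hook word."""
    return format((hookword >> 16) & 0xFFFF, "04x")

for abbreviation in ("139", "1390", "0100", "010", "10"):
    print(abbreviation, "->", normalize_hookid(abbreviation))

# Illustrative hook word, assumed to match the 1390 example in the text.
print(hookid_from_hookword(0x13900000))
```

Note that a full 4-digit value is left unchanged: the trailing zero is added only when fewer than 4 digits are given.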

You can specify the hookid argument with the * wildcard character. This will probe all system tracing, with likely unacceptable performance implications. Hence, such a specification must be used only when absolutely necessary.

The second tuple is reserved, and must be specified as an asterisk, as shown.

Only system trace events that actually occur and record system trace data trigger probes. In particular, a system trace probe can only occur when system trace is active. The systrace probe manager is an event-based probe manager. Hence, probe name, function name, and location point are not available. As the hookword is passed to the script, this is not a significant restriction.

A non-root user is limited to at most 64 systrace probes simultaneously enabled. No more than 128 explicit systrace probes can be enabled system-wide.

ProbeVue built-in register variables allow access to the data traced. You cannot use the __arg* variables for this purpose. There are two general styles for system tracing.

The following style is for the trchook(64)/utrchook(64) (or the equivalent TRCHKLx macros in C) hooks:

  • __r3 contains the 16 bit hookid.
  • __r4 contains the subhookid.
  • __r5 contains traced data word D1.
  • __r6 contains traced data word D2.
  • __r7 contains traced data word D3.
  • __r8 contains traced data word D4.
  • __r9 contains traced data word D5.

Not all trace hooks contain all 5 data words. Undefined data words from a given trace hook will appear as zero. The Vue clause for a given hook ID must know exactly what and how much data its hook ID traces.

If the trace record was produced by one of the functions in the trcgen or trcgent family, use the following style:

  • __r3 contains the 16 bit hookid.
  • __r4 contains the subhookid.
  • __r5 contains traced data word D1.
  • __r6 contains the length of the traced data.
  • __r7 contains the address of the traced data.

The following script shows a simple example of the systrace probe manager:

	@@systrace:*:1390
	{
		if (__r4 == 0) {	/* normal fork is traced with subhookid zero */
			printf("HKWD_SYSC_FORK: %d forks child %d\n", __pid, __r5);
			exit();
		}
	}

System trace must be active for the systrace probe to be triggered.

With appropriate privilege, a Vue script can itself generate system trace records using the "RAS events" Vue functions. However, the systrace probe manager does not detect trace records produced through a Vue script.

Extended system call probe manager (syscallx)

In contrast to the syscall probe manager, which supports only the system calls listed in Table 7, the syscallx probe manager allows all base system calls to be traced. Base system calls are the set of system calls exported by the kernel and base kernel extensions, which are available immediately after boot-up. System calls that are exported from kernel extensions that may be loaded later are not supported. Either a specific system call or all system calls can be specified through the probe point tuple. However, unlike the syscall probe manager, the third field of the probe point tuple for the syscallx probe manager must identify the actual kernel entry point function. The syscallx probe manager can also limit probes to fire in a specific process if the process ID is specified as the second field of the probe point tuple.

The following are some examples:

/* Probe point tuple to probe the read system call entry for all processes */
@@syscallx:*:kread:entry
/* Probe point tuple to probe the fork system call exit for process with ID 434 */
@@syscallx:434:kfork:exit
/* Probe point tuple to probe entry for all base system calls */
@@syscallx:*:*:entry
/* Probe point tuple to probe exit for all base system calls for process 744 */
@@syscallx:744:*:exit

System calls supported by the syscall probe manager

The following table lists the system calls supported by the syscall probe manager along with the actual entry name in the kernel.
Note: The kernel entry name is provided here only for documentation purposes. The kernel entry names can change between releases or even after a service update.
Table 7. System calls supported by the syscall probe manager
System call name Kernel entry name
absinterval absinterval
accept accept1
bind bind
close close
creat creat
execve execve
exit _exit
fork kfork
getgidx getgidx
getgroups getgroups
getinterval getinterval
getpeername getpeername
getpid _getpid
getppid _getppid
getpri _getpri
getpriority _getpriority
getsockname getsockname
getsockopt getsockopt
getuidx getuidx
incinterval incinterval
kill kill
listen listen
lseek klseek
mknod mknod
mmap mmap
mq_close mq_close
mq_getattr mq_getattr
mq_notify mq_notify
mq_open mq_open
mq_receive mq_receive
mq_send mq_send
mq_setattr mq_setattr
mq_unlink mq_unlink
msgctl msgctl
msgget msgget
msgrcv __msgrcv
msgsnd __msgsnd
nsleep _nsleep
open kopen
pause _pause
pipe pipe
plock plock
poll _poll
read kread
reboot reboot
recv _erecv
recvfrom _enrecvfrom
recvmsg _erecvmsg
select _select
sem_close _sem_close
sem_destroy sem_destroy
sem_getvalue sem_getvalue
sem_init sem_init
sem_open _sem_open
sem_post sem_post
sem_unlink sem_unlink
sem_wait _sem_wait
semctl semctl
semget semget
semop __semop
semtimedop __semtimedop
send _esend
sendmsg _esendmsg
sendto _esendto
setpri _setpri
setpriority _setpriority
setsockopt setsockopt
setuidx setuidx
shmat shmat
shmctl shmctl
shmdt shmdt
shmget shmget
shutdown shutdown
sigaction _sigaction
sigpending _sigpending
sigprocmask sigprocmask
sigsuspend _sigsuspend
socket socket
socketpair socketpair
stat statx
waitpid kwaitpid
write kwrite
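
Because the syscallx probe manager expects the kernel entry name in the third field, the table above can serve as a lookup when converting a system call name into a syscallx probe point tuple. The Python sketch below copies only a handful of rows from Table 7 for illustration; the function name syscallx_probe is hypothetical, and, as the note above warns, kernel entry names can change between releases or service updates, so the mapping should be checked against the running system.

```python
# A few rows copied from Table 7 (system call name -> kernel entry name).
KERNEL_ENTRY = {
    "read": "kread",
    "write": "kwrite",
    "open": "kopen",
    "fork": "kfork",
    "close": "close",
}

def syscallx_probe(syscall, location, pid="*"):
    """Build a syscallx probe point tuple for a system call name.

    pid defaults to "*", which selects all processes.
    """
    if location not in ("entry", "exit"):
        raise ValueError("location must be 'entry' or 'exit'")
    if syscall not in KERNEL_ENTRY:
        raise KeyError("no kernel entry name recorded for " + syscall)
    return "@@syscallx:%s:%s:%s" % (pid, KERNEL_ENTRY[syscall], location)

print(syscallx_probe("read", "entry"))
print(syscallx_probe("fork", "exit", pid=434))
```

The two printed tuples correspond to the first two syscallx examples shown earlier in this section.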

Running in a WPAR

Workload partitions or WPARs are virtualized operating system environments within a single instance of the AIX operating system. The WPAR environment is somewhat different from the standard AIX operating system environment.

Dynamic tracing is supported in the WPAR environment. By default, when creating a WPAR, only the PV_PROBEVUE_TRC_USER_SELF and the PV_PROBEVUE_TRC_USER privileges are assigned to the WPAR and the superuser (root) on a WPAR system will be granted these privileges. An admin user from the global partition can change the value of the default WPAR privilege set or can explicitly assign additional privileges when creating the WPAR.

Privileges in a WPAR generally have the same meanings as in a global partition. Be careful when assigning the PV_PROBEVUE_TRC_KERNEL or the PV_PROBEVUE_TRC_MANAGE privilege to a WPAR. Any user with the PV_PROBEVUE_TRC_KERNEL privilege can access global kernel variables, while a user with the PV_PROBEVUE_TRC_MANAGE privilege can change the values of ProbeVue parameters or shut down ProbeVue. These changes affect all users, even those in other partitions.

When you issue the probevue command in a WPAR, processes running in other WPARs or in the global partition are not visible to it, so you can probe only processes in your own WPAR. The probevue command fails if the probe specification contains a process ID that is outside its partition. The PV_PROBEVUE_TRC_USER and PV_PROBEVUE_TRC_SYSCALL privileges in a WPAR allow you to probe only user space functions or system calls of processes that are in your WPAR. When probing system calls, the second field of the syscall probe specification must be set to a valid WPAR-visible process ID. Assigning the value * to the second field is not supported.
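
The process ID restriction can be illustrated with a minimal sketch. It assumes that the WPAR has been assigned the PV_PROBEVUE_TRC_SYSCALL privilege (it is not part of the default WPAR privilege set) and that the target process ID is passed as the first argument on the probevue command line, where the script refers to it as $1. The script name wpar_read.e is hypothetical.

```
/* wpar_read.e (hypothetical name): run as  probevue wpar_read.e <pid>   */
/* $1 is replaced by the first command-line argument, which must be the  */
/* ID of a process that is visible inside this WPAR. Using * in this     */
/* field is not supported in a WPAR.                                     */
@@syscall:$1:read:entry
{
        printf("read entered: pid=%lld, pname=[%s]\n", __pid, __pname);
}
```

If the supplied process ID belongs to another WPAR or to the global partition, the probevue command is expected to fail instead of starting the session.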

When a ProbeVue session is initiated in a mobile WPAR, it temporarily switches the WPAR to a non-checkpointable state. After the ProbeVue session terminates, the WPAR is checkpointable again.

I/O probe manager

The I/O probe manager provides capabilities to trace I/O operation events in various layers of the AIX I/O stack. Use the syscall probe manager to trace an application I/O request that is triggered by a read or write system call. Use the I/O probe manager to probe deeper, into the layers below the syscall layer.

Use the I/O probe manager to analyze the response time of I/O operations on a block device, separating the service time from the queuing delay.

The following layers are supported:

  • Logical File System (LFS)
  • Virtual File System (VFS)
  • Enhanced Journaled File Systems (JFS2)
  • Logical Volume Manager (LVM)
  • Small Computer System Interface (SCSI) disk driver
  • Generic block devices

The primary use cases for I/O probe manager are as follows:

  • Identify the following patterns of I/O usage of a device in a specified time period. A valid device can be a disk, logical volume, volume group, or file system (type or mount path):
    • I/O operation count
    • Size of I/O operations
    • Type of I/O operation (read/write)
    • Sequential or random nature of I/O
  • Get process or thread-wise usage information of a file system (type or mount path), logical volume, volume group, or disk.
  • Get an end-to-end mapping of I/O flow among various layers (wherever possible).
  • Monitor a specific I/O resource usage. For example:
    • Trace any write operations of the /etc/passwd file.
    • Trace read operation on block 0 of the hdisk0 device.
    • Trace when a new logical volume is opened in root volume group (rootvg).
  • For Multipath I/O (MPIO) disks, get path-specific information by the following actions:
    • Get path-wise usage and response time information.
    • Identify path switching or path failure.
  • For I/O errors, get more details about the error in disk driver layer.

Probe specification

I/O probes must be specified in the following format in a Vue script:

@@io:sub_type:io_event:operation_type:filter[|filter …]

This specification consists of five tuples that are separated by colons (:). The first tuple is always @@io.

Probe sub type

The second tuple signifies the sub type of the probe that indicates the layer of AIX I/O stack that contains the probe. This tuple can have one of the following values:

Table 8. Second tuple for probes
Second tuple (sub type) Description
disk This probe starts for disk driver events. Currently, the I/O probe manager supports only the scsidisk driver.
lvm This probe starts for Logical Volume Manager (LVM) events.
bdev This probe starts for any block I/O device. Disks, CD-ROMs, and diskettes are examples of block devices. Use this sub type only when no other sub type is applicable, for example, when the block device is not a disk, volume group, or logical volume.
jfs2 This probe starts for JFS2 file system events.
vfs This probe starts for any read/write operation on a file.
Note: The second tuple cannot have a value of asterisk (*).

For a disk type of second tuple, the third tuple can have the following values:

Table 9. Disk second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
disk entry This probe starts whenever the disk driver receives an I/O request to process.
disk iostart This probe starts when the disk driver picks up an I/O request from its ready queue and sends it down to the lower layer (for example, the adapter driver). A single original I/O request to the disk driver can result in multiple command requests (some might be driver-related task management command requests) being sent to the lower layer. However, sometimes the driver can combine multiple original requests and send a single request to the lower layer.
disk iodone This probe starts when the lower layer (for example, the adapter driver) returns an I/O request (successful or failed) to the disk driver.
disk exit This probe starts when the disk driver returns an I/O request (successful or failed) to its upper layer.
Note: The following built-in variables are available in the disk probes: __iobuf, __diskinfo, __diskcmd (only in disk:iostart and disk:iodone), and __iopath (only in disk:iostart and disk:iodone).

For every entry, a corresponding exit probe is defined that has the same __iobuf->bufid value available at both the probe points. The entry event can be followed by multiple iostart events, but at least one of them must have the same __iobuf->bufid value. Every iostart event has a matching iodone event that has the same __iobuf->child_bufid value.
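
The bufid matching rule can be used to estimate the queuing delay inside the disk driver, that is, the time between the entry event and the iostart event that carries the same bufid. The following Vue clauses are a minimal sketch, not a definitive implementation. They assume that a disk named hdisk0 exists and that associative arrays and the timestamp(), diff_time(), exists(), and delete() functions are available at your AIX level. The array name entry_ts is hypothetical.

```
@@io:disk:entry:*:hdisk0
{
        /* Remember when the disk driver received this request. */
        entry_ts[__iobuf->bufid] = timestamp();
}

@@io:disk:iostart:*:hdisk0
{
        /* Only an iostart event that carries the original bufid is matched. */
        if (exists(entry_ts, __iobuf->bufid)) {
                printf("bufid=%lld waited %lld microseconds before iostart\n",
                        __iobuf->bufid,
                        diff_time(entry_ts[__iobuf->bufid], timestamp(), MICROSECONDS));
                /* Drop the key so that the array does not grow without bound. */
                delete(entry_ts, __iobuf->bufid);
        }
}
```

The same pattern, keyed on __iobuf->child_bufid in iostart and iodone, would give the service time of the lower layer.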

For an LVM type of second tuple, the third tuple can have the following values:

Table 10. LVM second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
lvm entry This probe starts whenever the LVM layer receives an I/O request to process.
lvm iostart This probe starts when LVM picks an I/O request from its ready queue and sends it down to the lower layer (usually the disk driver).
lvm iodone This probe starts when the lower layer (for example, the disk driver) returns an I/O request (successful or failed) to LVM.
lvm exit This probe starts when LVM returns an I/O request (successful or failed) to its upper layer.
Note: The following built-in variables are available in the LVM probes: __iobuf, __lvol, and __volgrp. Every entry event has a corresponding exit event that has the same __iobuf->bufid value.

The entry event can be followed by multiple iostart events, but at least one of them has the same __iobuf->bufid value. Every iostart event has a matching iodone event that has the same __iobuf->child_bufid value.

For generic block device probes, the third tuple can have the following values:

Table 11. Generic block device second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
bdev iostart This probe starts when I/O to any block device (for example, a disk, logical volume, or CD-ROM) is initiated, that is, when the AIX devstrat kernel service is called by any code.
bdev iodone This probe starts when a block I/O request completes, that is, when the AIX iodone kernel service is called by any code.
Note: The following built-in variable is available in the bdev probes: __iobuf. Every iostart event has a matching iodone event that has the same __iobuf->bufid value.

For JFS2 file system probes, the third tuple can have the following values:

Table 12. JFS2 second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
jfs2 buf_map This probe starts when a logical file extent gets mapped to an I/O buffer and is sent to the underlying logical volume.
Note: The following built-in variable is available in the JFS2 file system probes: __j2info.

For Virtual file system (VFS) probes, the third tuple can have the following values:

Table 13. VFS second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
vfs entry This probe starts when any read/write operation on a file is initiated.
vfs exit This probe starts when any read/write operation on a file is completed (whether successful or failed).
Note: The following built-in variable is available in the VFS probes: __file.

For the same thread, every entry is followed by an exit event that has the same __file->inode_id value.

Probe operation type

The fourth tuple indicates the type of I/O operation that is specified by the probe. The fourth tuple can have one of the following values:

Table 14. Fourth tuple for I/O operation
Fourth tuple Description
read The probe starts for only the read operation.
write The probe starts for only the write operation.
* The probe starts for both read and write operations.

Probe filter

The fifth tuple is the filter tuple, which restricts the probe to more specific targets. The possible values depend on the sub type. Multiple values can be specified, separated by the | character, and the probe starts if it matches any of those filters. If the value of the fifth tuple is *, no filtering occurs and the probe starts if the other tuples match. If multiple filters are specified and one of them is *, the result is equivalent to a whole tuple value of *.

For disk probes, the fifth tuple can have the following values:

Table 15. Disk filter tuple
Filter (fifth tuple) Description
Disk name. For example, hdisk0 The probe action is run only for the particular disk.
Disk type. Allowed symbols: FC, ISCSI, VSCSI, SAS The probe action is run only for disks with matching type. The meanings of the symbols are as follows:
  • FC: Fibre Channel disk
  • ISCSI: iSCSI disk
  • VSCSI: Virtual SCSI disk (on VIOS client)
  • SAS: Serial Attached SCSI disk
Note: The disk name and disk type can be combined as filters. For example, the following probe starts for either hdisk0 or any FC disk (at the disk entry event, for both read and write operations):
@@io:disk:entry:*:hdisk0|FC
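
A probe specification becomes a complete clause once an action block follows it. The following sketch attaches a simple action to a specification of this form. It assumes only the built-in variables that are documented for disk probes, and the message text is illustrative.

```
/* sub_type=disk, io_event=entry, operation_type=* (read and write),  */
/* filter=hdisk0 or any Fibre Channel disk                             */
@@io:disk:entry:*:hdisk0|FC
{
        printf("%s: request received, block=%lld, bytes=%lld\n",
                __diskinfo->name, __iobuf->blknum, __iobuf->bcount);
}
```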

For Logical Volume Manager (LVM) probes, the fifth tuple can have the following values:

Table 16. LVM filter tuple
Filter (fifth tuple) Description
Logical volume name, for example hd5, lg_dumplv The probe action is run only for the particular logical volume.
Volume group name, for example rootvg The probe action is run only for those logical volumes that belong to a particular volume group.

The following probe starts for any logical volume that belongs to either root volume group (rootvg), or test volume group (testvg) (at iostart event, for write operation only):

@@io:lvm:iostart:write:rootvg|testvg

For generic block device probes, the fifth tuple can have the following values:

Table 17. Generic block device filter tuple
Filter (fifth tuple) Description
Block device name, for example: hdisk0, hd5, cd0 The probe action is run only for the particular block device.

Consider the following examples for generic block device probes:

@@io:bdev:iostart:*:cd0

@@io:bdev:iodone:read:hdisk3|hdisk5

For JFS2 file system probes, the fifth tuple can have the following values:

Table 18. JFS2 filter tuple
Filter (fifth tuple) Description
File system mount path, for example: /usr The probe action is run only for the file system with the particular mount path. It must be a JFS2 file system; otherwise, ProbeVue rejects the probe specification.

Consider the following example for JFS2 file system probes:

@@io:jfs2:buf_map:*:/usr|/tmp

For Virtual file system (VFS) probes, the fifth tuple can have the following values:

Table 19. VFS filter tuple
Filter (fifth tuple) Description
File system mount path. For example, /tmp The probe action is run for files that belong to the file system.
File system type. The allowed symbols are JFS2, NAMEFS, NFS, JFS, CDROM, PROCFS, SFS, CACHEFS, NFS3, AUTOFS, POOLFS, VXFS, VXODM, UDF, NFS4, RFS4, CIFS, PMEMFS, AHAFS, STNFS, ASMFS The probe action is run for files of the particular file system. The symbols correspond to the AIX file systems defined in the exported header file sys/vmount.h.

Consider the following examples for the Virtual file system (VFS) probes:

@@io:vfs:entry:read:JFS2

@@io:vfs:exit:*:/usr|JFS

I/O probe related built-in variables for Vue scripts

__iobuf built-in variable

You can use the special __iobuf built-in variable to access various information about the I/O buffer that is employed in the current I/O operation. It is accessible in probes of sub types: disk, lvm, and bdev. Its member elements can be accessed by using the __iobuf->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. This value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value, num_pagefaults, is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __iobuf built-in variable has the following members:

Table 20. The __iobuf built-in variable members
Member name Type Description Invalid Value
blknum unsigned long long Starting block number of the I/O request. 0xFFFFFFFFFFFFFFFF
bcount unsigned long long Requested number of bytes in the I/O operation. 0xFFFFFFFFFFFFFFFF
bflags unsigned long long The flags that are associated with the I/O operation. The following symbols are available: B_READ, B_ASYNC, B_ERROR. The symbols can be used along with the bflags value to see whether it is set. For example, if (__iobuf->bflags & B_READ) is true, then it is a read operation.
Note: There is no B_WRITE flag. If the B_READ flag is not set, it is considered to be write operation.
0
devnum unsigned long long The device number of the target device that is associated with the I/O operation. The device major number and minor number are embedded in it. 0
major_num int The major number of the target device of the I/O operation. -1
minor_num int The minor number of the target device of the I/O operation. -1
error int In case of any error in the I/O operation, this value is the error number. This value is defined in the exported errno.h header file. -1
residue unsigned long long The remaining number of bytes from the original request that might not be read or written. On the I/O completion events, this value is ideally zero. But for read operation, a nonzero value might mean that you are trying to read more than what is available, which is acceptable. This value is considered only when error value is nonzero. 0xFFFFFFFFFFFFFFFF
bufid unsigned long long A unique number that is associated with the I/O request. While the I/O is in progress, the bufid value uniquely identifies the I/O request in all the events of a particular sub type. For example, if the __iobuf->bufid value matches in disk:entry, disk:iostart, disk:iodone, and disk:exit, it is the same I/O request at various stages. 0
parent_bufid unsigned long long If the value is not 0, this value provides the bufid of the upper layer buffer that is associated with this I/O request. You can use it to link the current I/O operation with the upper layer I/O request. For example, for a disk I/O request, the corresponding LVM I/O can be determined.
Note: The parent_bufid field is not set in all code paths, and hence it is not always useful. Use the child_bufid field to link I/O requests between two adjacent layers.
0
child_bufid unsigned long long If the value is not 0, this value provides the bufid of the new I/O request that is sent to the lower layer. The best events in which to record it are disk:iostart, lvm:iostart, and bdev:iostart. You can identify the I/O in the lower adjacent layer by matching the __iobuf->bufid value to this child_bufid value. For example, in lvm:iostart, you can record the __iobuf->child_bufid value. Then, in disk:entry, you can match it with __iobuf->bufid to identify the corresponding I/O request. 0
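
The child_bufid linkage between two adjacent layers can be sketched as follows. The clauses record the child_bufid value at lvm:iostart and look it up at disk:entry, and they use the B_READ flag to tell read requests from write requests. This is an illustration under stated assumptions: the rootvg volume group is used only as an example filter, the array name lvm_parent is hypothetical, and the exists() and delete() associative array functions are assumed to be available at your AIX level.

```
@@io:lvm:iostart:*:rootvg
{
        /* A child_bufid of 0 is the documented invalid value; skip it. */
        if (__iobuf->child_bufid != 0) {
                lvm_parent[__iobuf->child_bufid] = __iobuf->bufid;
        }
}

@@io:disk:entry:*:*
{
        /* The bufid seen by the disk driver is the child_bufid that LVM sent. */
        if (exists(lvm_parent, __iobuf->bufid)) {
                if (__iobuf->bflags & B_READ) {
                        printf("disk read  bufid=%lld comes from lvm bufid=%lld\n",
                                __iobuf->bufid, lvm_parent[__iobuf->bufid]);
                } else {
                        /* There is no B_WRITE flag: not a read means a write. */
                        printf("disk write bufid=%lld comes from lvm bufid=%lld\n",
                                __iobuf->bufid, lvm_parent[__iobuf->bufid]);
                }
                delete(lvm_parent, __iobuf->bufid);
        }
}
```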

__file built-in variable

You can use the __file special built-in variable to get various information about the file operation. It is available in probes of sub type vfs. Its member elements can be accessed by using the __file->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as invalid is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location, which contains the value, is paged out.
  • Any other severe system error such as invalid pointer, or corrupted memory.

The __file built-in variable has the following members:

Table 21. The __file built-in variable members
Member name Type Description Invalid Value
f_type int Specifies the type of the file. It can match one of the following built-in constant values:
  • F_REG (regular file)
  • F_DIR (directory)
  • F_BLK (block device file)
  • F_CHR (character device file)
  • F_LNK (file link)
  • F_SOCK (socket)
Note: The value might not match any of the built-in constants because the list does not include every possible file type, but only the most useful ones.
-1
fs_type int Specifies the type of the file system to which this file belongs. It can match one of the following built-in constant values:
  • FS_JFS2
  • FS_NAMEFS
  • FS_NFS
  • FS_JFS
  • FS_CDROM
  • FS_PROCFS
  • FS_SFS
  • FS_CACHEFS
  • FS_NFS3
  • FS_AUTOFS
  • FS_POOLFS
  • FS_VXFS
  • FS_VXODM
  • FS_UDF
  • FS_NFS4
  • FS_RFS4
  • FS_CIFS
  • FS_PMEMFS
  • FS_AHAFS
  • FS_STNFS
  • FS_ASMFS

The built-in constants correspond to the AIX file system types defined in the exported sys/vmount.h header file.

-1
mount_path char * Specifies the path where the associated file system is mounted. null string
devnum unsigned long long Specifies the device number of the associated block device of the file. Both the major and minor numbers are embedded in it. If there is no associated block device, then it is 0. 0
major_num int Specifies the major number of the associated block device of the file. -1
minor_num int Specifies the minor number of the associated block device of the file. -1
offset unsigned long long Specifies the current read/write byte offset of the file. 0xFFFFFFFFFFFFFFFF
rw_mode int Specifies the read/write mode of the file. It matches one of the built-in constant values: F_READ or F_WRITE. -1
byte_count unsigned long long At the vfs:entry event, byte_count provides the byte count of the read or write request. At the vfs:exit event, it provides the number of bytes that remained unfulfilled. Therefore, the difference of this value between these two events determines how many bytes were processed in the operation. 0xFFFFFFFFFFFFFFFF
fname char * Specifies the name of the file (only base name, not path). null string
inode_id unsigned long long Specifies a system-wide unique number that is associated with the file.
Note: It is different from file inode number.
0
path path_t (new data type in Vue) Specifies the complete file path. It can be printed by using printf() and the format specifier %p. null string as file path
error int If the read/write operation failed, the error number as defined in the exported errno.h header file. If there is no error, it is 0. -1
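
The __file members can be combined to report failed file operations. The following clause is a minimal sketch that assumes a file system mounted at /tmp (the example mount path from the VFS filter table). It prints the complete path with the %p format specifier and relies on the documented convention that error is 0 on success and -1 when the value cannot be obtained.

```
@@io:vfs:exit:*:/tmp
{
        /* error > 0 excludes both success (0) and the invalid value (-1). */
        if (__file->error > 0) {
                if (__file->rw_mode == F_WRITE) {
                        printf("write failed: errno=%d, unfulfilled bytes=%lld, path=%p\n",
                                __file->error, __file->byte_count, __file->path);
                } else {
                        printf("read failed: errno=%d, unfulfilled bytes=%lld, path=%p\n",
                                __file->error, __file->byte_count, __file->path);
                }
        }
}
```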

__lvol built-in variable

You can use the __lvol special built-in variable to get various information about the logical volume in an LVM operation. It is available in probes of sub type lvm. Its member elements can be accessed by using the __lvol->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __lvol built-in variable has the following members:

Table 22. The __lvol built-in variable members
Member name Type Description Invalid Value
name char * The name of the logical volume. null string
devnum unsigned long long The device number of the logical volume. Both the major number and the minor number are embedded in it. 0
major_num int The major number of the logical volume. -1
minor_num int The minor number of the logical volume. -1
lv_options unsigned int The options that are related to the logical volume. The following values are defined as built-in constants:
  • LV_RDONLY (read-only logical volume)
  • LV_NOMWC (no mirror write consistency checking)
  • LV_ACTIVE_MWC (active mirror write consistency)
  • LV_PASSIVE_MWC (passive mirror write consistency)
  • LV_SERIALIZE_IO (I/O is serialized)
  • LV_DMPDEV (This LV is a dump device)

You can check whether one of these values is set by using a condition such as __lvol->lv_options & LV_RDONLY.

Note: Not all possible values are defined, so other options might be set in the value.
0xFFFFFFFF

__volgrp built-in variable

You can use the __volgrp special built-in variable to get various information about the volume group in an LVM operation. It is available in probes of sub type lvm. Its member elements can be accessed by using the __volgrp->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __volgrp built-in variable has the following members:

Table 23. The __volgrp built-in variable members
Member name Type Description Invalid Value
name char * The name of the volume group. null string
devnum unsigned long long The device number of the volume group. Both the major number and the minor number are embedded in it. 0
major_num int The major number of the volume group. -1
minor_num int The minor number of the volume group.
Note: For volume group, AIX always assigns 0 as the minor number.
-1
num_open_lvs int The number of open logical volumes that belong to this volume group. -1
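
Because __lvol, __volgrp, and __iobuf are all available in LVM probes, a single clause can describe both the logical volume and its volume group. The following sketch uses rootvg only as an example filter and tests the LV_RDONLY option with the & operator.

```
@@io:lvm:entry:write:rootvg
{
        /* lv_options is a bit mask, so test individual options with &. */
        if (__lvol->lv_options & LV_RDONLY) {
                printf("write request on read-only logical volume %s\n", __lvol->name);
        }
        printf("lv=%s, vg=%s, open LVs in vg=%d, block=%lld, bytes=%lld\n",
                __lvol->name, __volgrp->name, __volgrp->num_open_lvs,
                __iobuf->blknum, __iobuf->bcount);
}
```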

__diskinfo built-in variable

You can use the __diskinfo special built-in variable to get various information about the disk in a disk I/O operation. It is available in probes of sub type disk. Its member elements can be accessed by using the __diskinfo->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __diskinfo built-in variable has the following members:

Table 24. The __diskinfo built-in variable members
Member name Type Description Invalid Value
name char * The name of the disk. null string.
devnum unsigned long long The device number of the disk. It has major number and minor number that are embedded in it. 0
major_num int The major number of the disk. -1
minor_num int The minor number of the disk. -1
lun_id unsigned long long The Logical Unit Number (LUN) for the disk. 0xFFFFFFFFFFFFFFFF
transport_type int The transport type of the disk. It can match one of the following built-in constant values:
  • T_FC (Fibre Channel)
  • T_ISCSI (iSCSI)
  • T_VSCSI (Virtual SCSI)
  • T_SAS (Serial Attached SCSI)
-1
queue_depth int The queue depth of the disk. It indicates the maximum number of simultaneous I/O requests that the disk driver can pass on to the lower layer (for example, the adapter). If the number of incoming I/O requests exceeds queue_depth, the disk driver holds the extra requests in its wait queue until the lower layer responds to at least one of the outstanding I/O requests. -1
cmds_out int Number of outstanding I/O command requests to the lower layer (for example, adapter). -1
path_count int Number of MPIO paths of the disk (Only if the disk is MPIO capable, else it is 0). -1
reserve_policy int The SCSI reservation policy of the disk. It matches one of the following built-in constant values:
  • DK_NO_RESERVE (no_reserve)
  • DK_SINGLE_PATH (single_path)
  • DK_PR_EXCLUSIVE (PR_exclusive)
  • DK_PR_SHARED (PR_shared)

For more information about the reservation policies, see the AIX MPIO documentation.

-1
scsi_flags int The SCSI flags of the disk. The following built-in flag values are defined:
  • SC_AUTOSENSE_ENABLED (On error, the target sends sense data in the response. The initiator does not need to send a request sense command.)
  • SC_NACA_1_ENABLED (Normal ACA is enabled and the target goes to ACA state if it is returning check condition.)
  • SC_64BIT_IDS (64-bit SCSI ID and logical unit number (LUN))
  • SC_LUN_RESET_ENABLED (LUN reset command can be sent.)
  • SC_PRIORITY_SUP (Device supports I/O priority.)
  • SC_CACHE_HINT_SUP (Device supports cache hints.)
  • SC_QUEUE_UNTAGGED (Device supports queuing of untagged commands.)
Note: Not all flag values are defined, so other flags might be set in the value.
0
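
The queue_depth and cmds_out members can be compared to detect requests that are likely to wait in the disk driver. The following clause is a minimal sketch that assumes a disk named hdisk0. The comparison is an illustration of the description above, not a documented saturation test.

```
@@io:disk:entry:*:hdisk0
{
        /* -1 is the invalid value for both members, so require a positive depth. */
        if (__diskinfo->queue_depth > 0 &&
            __diskinfo->cmds_out >= __diskinfo->queue_depth) {
                printf("%s: cmds_out=%d has reached queue_depth=%d, request may wait\n",
                        __diskinfo->name, __diskinfo->cmds_out, __diskinfo->queue_depth);
        }
}
```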

__diskcmd built-in variable

You can use the __diskcmd special built-in variable to get various information about the SCSI I/O command for the current operation. It is available in probes of sub type disk (but only in iostart and iodone events). Its member elements can be accessed by using the __diskcmd->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __diskcmd built-in variable has the following members:

Table 25. The __diskcmd built-in variable members
Member name Type Description
cmd_type int The type of the SCSI command (both type and subtype are merged together). The following built-in constant values are available as command type:
  • DK_BUF (normal I/O read/write)
  • DK_IOCTL (ioctl)
  • DK_REQSNS (Request sense)
  • DK_TGT_LUN_RST (target or LUN reset)
  • DK_TUR (Test unit ready)
  • DK_INQUIRY (Inquiry)
  • DK_RESERVE (SCSI-2 RESERVE, 6-byte version)
  • DK_RELEASE (SCSI-2 RELEASE, 6-byte version)
  • DK_RESERVE_10 (SCSI-2 RESERVE, 10-byte version)
  • DK_RELEASE_10 (SCSI-2 RELEASE, 10-byte version)
  • DK_PR_RESERVE (SCSI-3 Persistent Reserve, RESERVE)
  • DK_PR_RELEASE (SCSI-3 Persistent Reserve, RELEASE)
  • DK_PR_CLEAR (SCSI-3 Persistent Reserve, CLEAR)
  • DK_PR_PREEMPT (SCSI-3 Persistent Reserve, PREEMPT)
  • DK_PR_PREEMPT_ABORT (SCSI-3 Persistent Reserve, PREEMPT AND ABORT)
  • DK_READCAP (READ CAPACITY, 10-byte version)
  • DK_READCAP16 (READ CAPACITY, 16-byte version)
Note: The built-in constants are bit position values and hence their presence must be checked by using ‘&’ operator (the ‘==’ operator must not be used). For example: __diskcmd->cmd_type & DK_IOCTL.
retry_count int It indicates whether the I/O command is retried after any failure.
Note: The value of 1 means that it is the first attempt. Any larger value indicates actual retrials.
path_switch_count int It indicates how many times the path was changed for this particular I/O operation (usually indicates some I/O path failure, either transient or permanent).
status_validity int In case of any error, this value indicates whether it is a SCSI error or adapter error. It can match one of the following built-in constant values: SC_SCSI_ERROR or SC_ADAPTER_ERROR. If there is no error, then it is 0.
scsi_status int If the status_validity field is set to SC_SCSI_ERROR, this field gives more details about the error. It can match one of the built-in constant values:
  • SC_GOOD_STATUS (Task is completed successfully)
  • SC_CHECK_CONDITION (Some error, sense data provides more information)
  • SC_BUSY_STATUS (LUN is busy, cannot accept command)
  • SC_RESERVATION_CONFLICT (Violation of existing SCSI reservation.)
  • SC_COMMAND_TERMINATED (The device ended the command.)
  • SC_QUEUE_FULL (The device queue is full.)
  • SC_ACA_ACTIVE (The device is in Auto Contingent Allegiance state.)
  • SC_TASK_ABORTED (The device stopped the command.)
Note: Not all possible values are defined. Hence, when status_validity is SC_SCSI_ERROR, the scsi_status field can have a value that does not match any of the built-in values. You can look up the corresponding SCSI command response code.
adapter_status int If the status_validity field is set to SC_ADAPTER_ERROR, this field provides more information about the error. It can match one of the following built-in constant values:
  • ADAP_HOST_IO_BUS_ERR (Host I/O bus error)
  • ADAP_TRANSPORT_FAULT (transport layer error)
  • ADAP_CMD_TIMEOUT (I/O command was timed out)
  • ADAP_NO_DEVICE_RESPONSE (no response from the device)
  • ADAP_HDW_FAILURE (adapter hardware failure)
  • ADAP_SFW_FAILURE (adapter microcode failure)
  • ADAP_TRANSPORT_RESET (adapter detected an external SCSI bus reset)
  • ADAP_TRANSPORT_BUSY (transport layer is busy)
  • ADAP_TRANSPORT_DEAD (transport layer is inoperative)
  • ADAP_TRANSPORT_MIGRATED (transport layer is migrated)
  • ADAP_FUSE_OR_TERMINAL_PWR (adapter blown fuse or bad electrical termination)
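
The error-related members of __diskcmd are typically examined at the iodone event. The following sketch reports SCSI errors, adapter errors, retries, and path switches for normal read/write commands on any disk. It tests cmd_type with the & operator and compares status_validity with the built-in constants as described in the table. The message text is illustrative.

```
@@io:disk:iodone:*:*
{
        /* cmd_type holds bit position values, so use & and not ==. */
        if (__diskcmd->cmd_type & DK_BUF) {
                if (__diskcmd->status_validity == SC_SCSI_ERROR) {
                        printf("%s: SCSI error, scsi_status=%d\n",
                                __diskinfo->name, __diskcmd->scsi_status);
                }
                if (__diskcmd->status_validity == SC_ADAPTER_ERROR) {
                        printf("%s: adapter error, adapter_status=%d\n",
                                __diskinfo->name, __diskcmd->adapter_status);
                }
                /* A retry_count of 1 is the first attempt; larger values are retries. */
                if (__diskcmd->retry_count > 1 || __diskcmd->path_switch_count > 0) {
                        printf("%s: retry_count=%d, path_switch_count=%d\n",
                                __diskinfo->name, __diskcmd->retry_count,
                                __diskcmd->path_switch_count);
                }
        }
}
```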

__iopath built-in variable

You can use the __iopath special built-in variable to get various information about the I/O path for the current operation. It is available in probes of sub type disk for iostart and iodone events only. Its member elements can be accessed by using the __iopath->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __iopath built-in variable has the following members:

Table 26. The __iopath built-in variable members
Member name Type Description Invalid Value
path_id int The ID of the current path (starting from 0). -1
scsi_id unsigned long long The SCSI ID of the target on this path. 0xFFFFFFFFFFFFFFFF
lun_id unsigned long long The Logical Unit Number (LUN) on this path. 0xFFFFFFFFFFFFFFFF
ww_name unsigned long long The worldwide name of the target port on this path. 0
cmds_out int The number of I/O commands outstanding on this path. -1
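
For an MPIO disk, the path_id member can be tracked across iostart events to notice when I/O moves to a different path. The following sketch assumes a disk named hdisk0. The global variable last_path is hypothetical. A change of path is not necessarily a failure, because some multipath algorithms alternate paths by design.

```
int last_path;

@@BEGIN
{
        /* -1 never matches a real path ID, because path IDs start from 0. */
        last_path = -1;
}

@@io:disk:iostart:*:hdisk0
{
        if (__iopath->path_id != last_path) {
                printf("%s: now using path_id=%d (previous=%d), cmds_out on path=%d\n",
                        __diskinfo->name, __iopath->path_id, last_path,
                        __iopath->cmds_out);
                last_path = __iopath->path_id;
        }
}
```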

__j2info built-in variable

The __j2info is a special built-in variable that you can use to get various information about JFS2 file system operation. It is available in probes of sub type jfs2. Its member elements can be accessed by using the __j2info->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

__j2info has the following members:

Table 27. The __j2info built-in variable members
Member name Type Description Invalid Value
inode_id unsigned long long A system-wide unique number that is associated with the file of current operation.
Note: It is different from the file inode number.
0
f_type int Type of the file. The __file->f_type description provides possible values. -1
mount_path char * The path where the file system is mounted. null string.
devnum unsigned long long The device number of the underlying block device of the file system. It has both major number and minor number embedded. 0
major_num int The major number of the underlying block device of the file system. -1
minor_num int The minor number of the underlying block device of the file system. -1
l_blknum unsigned long long The logical block number for this file operation. 0xFFFFFFFFFFFFFFFF
l_bcount unsigned long long The requested byte count between the logical blocks in this operation. 0xFFFFFFFFFFFFFFFF
child_bufid unsigned long long The bufid of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->bufid. 0
child_blknum unsigned long long The block number of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->blknum. 0xFFFFFFFFFFFFFFFF
child_bcount unsigned long long The byte count of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->bcount. 0xFFFFFFFFFFFFFFFF
child_bflags unsigned long long The flags of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->bflags. 0

Example scripts for I/O probe manager

  1. Script to trace any write operation to the /etc/passwd file:
    
    int write(int, char *, int);
    @@BEGIN  {
            target_inodeid = fpath_inodeid("/etc/passwd");
    }
    @@syscall:*:write:entry {
            if (fd_inodeid(__arg1) == target_inodeid) {
                    printf("write on /etc/passwd: timestamp=%A, pid=%lld, pname=[%s], uid=%lld\n",
                            timestamp(), __pid, __pname, __uid);
            }
    }
    If the script is in a Vue file named etc_passwd.e, it can be run as:
    # probevue etc_passwd.e
    In another terminal, if the user (root) runs:
    # mkuser user1
    Then probevue displays an output similar to the following example:
    write on /etc/passwd: timestamp=Mar/03/15 16:10:07, pid=14221508, pname=[mkuser], uid=0
    
    
  2. Script to find the maximum and minimum I/O operation time for a disk (for example, hdisk0) in a period. Also, find the block number, requested byte count, time of operation and type of operation (read or write) corresponding to the maximum or minimum time.
long long min_time, max_time;
@@BEGIN {
        min_time = max_time = 0;
}
@@io:disk:entry:*:hdisk0 {
        ts_entry[__iobuf->bufid] = (long long)timestamp();
}
@@io:disk:exit:*:hdisk0 {
        if (ts_entry[__iobuf->bufid]) { /* only if we recorded entry time */
                ts_now = timestamp();
                op_type = (__iobuf->bflags & B_READ) ? "READ" : "WRITE";
                dt = (long long)diff_time(ts_entry[__iobuf->bufid], ts_now, MICROSECONDS);
                if (min_time == 0 || dt < min_time) {
                        min_time = dt;
                        min_blknum = __iobuf->blknum;
                        min_bcount = __iobuf->bcount;
                        min_ts = ts_now;
                        min_optype = op_type;
                }
                if (max_time == 0 || dt > max_time) {
                        max_time = dt;
                        max_blknum = __iobuf->blknum;
                        max_bcount = __iobuf->bcount;
                        max_ts = ts_now;
                        max_optype = op_type;
                }
                ts_entry[__iobuf->bufid] = 0;
        }
}
@@END {
        printf("Maximum and minimum IO operation time for [hdisk0]:\n");
        printf("Max: %lld usec, block=%lld, byte count=%lld, operation=%s, time of operation=[%A]\n",
                max_time, max_blknum, max_bcount, max_optype, max_ts);
        printf("Min: %lld usec, block=%lld, byte count=%lld, operation=%s, time of operation=[%A]\n",
                min_time, min_blknum, min_bcount, min_optype, min_ts);
}

If this script is in a Vue file named disk_min_max_time.e, it can be run as:
# probevue disk_min_max_time.e
Generate some I/O activity on hdisk0 (for example, by using the dd command).
After a few minutes, if the probevue command is terminated (by pressing CTRL-C), it prints output similar to the following example:
^CMaximum and minimum IO operation time for [hdisk0]:
Max: 48174 usec, block=6927976, byte count=4096, operation=READ, time of operation=[Mar/04/15 03:31:07]
Min: 133 usec, block=6843288, byte count=4096, operation=READ, time of operation=[Mar/04/15 03:31:03]

Network probe manager

The network probe manager tracks incoming and outgoing network packets in a system (packet information as interpreted by the bpf module in AIX). The probe specification allows the user to specify Berkeley Packet Filter (BPF) filters, similar to tcpdump filter expressions, for granular tracking.

You can use built-in variables to collect packet header and payload information for Internet protocols, for example, Ethernet, Internet Protocol Version 4/Version 6 (IPv4/v6), Transmission Control Protocol (TCP), User Datagram Protocol (UDP), Internet Control Message Protocol (ICMP), Internet Group Management Protocol (IGMP), and Address Resolution Protocol (ARP).

The network probe manager reports critical protocol-specific events (TCP state changes, round-trip times, retransmissions, UDP buffer overflows).

The network probe manager addresses the following primary use cases:

  • Provide the following packet-specific information according to the bpf module based on IP address and ports:
    • Track the incoming and outgoing bytes for a connection.
    • Use built-ins to gather the following protocol header and payload information:
      • TCP flags (SYN, FIN), TCP sequence and acknowledgment number.
      • IPv4/IPv6 (IP addresses, protocol types: tcp, udp, icmp, igmp, and so on).
      • ICMP (packet type: ECHO REQUEST, ECHO RESPONSE, and so on).
  • Provide access to complete RAW network packet for probe script processing.
  • Report the following protocol-related events:
    • Track TCP sender and receiver buffer full events.
    • TCP connection state changes from SYN-SENT state to ESTABLISHED state or from ESTABLISHED state to CLOSE state.
    • Monitor delta time between state changes (for example, time that is taken from SYN-SENT state to ESTABLISHED state).
    • Identify the listener (connection information) that discarded connections because the listener's queue is full.
    • Identify retransmissions (second and further retransmission for a packet) for TCP connections.
    • Identify the UDP socket that dropped packets because of insufficient receiving buffer.

Probe specification

The probe specification for the network probe manager contains three or five tuples that are separated by a colon (:). The first tuple is always @@net.

Network probe manager supports two major categories of specifications: One category gathers packet-specific information and another category gathers protocol-specific information.

  • Format to gather packet-specific information:

    @@net:bpf:<interface1>|<interface2>|...:<protocol>:<filter>

  • Format to gather protocol-specific information:

    @@net:tcp:<event_name>

    @@net:udp:<event_name>

Probe sub type

The second tuple signifies the sub type of the probe that indicates which layer of AIX network stack contains the probe. This tuple can have one of the following values (it cannot be *):

Table 28. Second tuple specification for probe sub type
Second Tuple (sub type) Description
bpf This probe starts at network interface layer when a packet matches the specific filter.
tcp This probe starts for TCP protocol-specific events.
udp This probe starts for UDP protocol-specific events.

Probe network event or gather network packet information

The third tuple is specific to the sub type that is specified in the second tuple. It cannot have a value of *.

bpf-based probes

The specification contains 5 tuples for bpf-based probes that are described in the following table:

Table 29. bpf-based probes: Tuple specification
Second tuple (Sub type) Subsequent tuples Description
bpf Third tuple: interface names This tuple specifies an interface or a list of interfaces for which the packet information can be captured. Possible values are enX (for example, en0, en1) and lo0. The * value is not supported for this tuple. You can specify one or more interfaces at a time by using | as the delimiter.
Fourth tuple: protocol This tuple specifies the network protocol to start the probe. Possible values are ether, arp, rarp, ipv4, ipv6, tcp, udp, icmp4, icmp6, and igmp. Protocol-specific built-ins are populated for access in the Vue script. For example, a protocol value of ipv4 populates the __ip4hdr built-ins.
The * value for this tuple indicates that the probe starts for all protocol types that match the specified filter. When the protocol is *, none of the built-in values that are supported by the network probe manager are available to Vue scripts. You can access the raw packet data of the requested size by using the Vue function copy_kdata() and map it to the corresponding protocol headers.
Note: Specifying * as a value can be performance intensive because the probe is started for all incoming and outgoing packets on the specified interfaces that match the filter. Copies are also involved when the packet information spans multiple packet buffers.
Fifth tuple: bpf filter string This tuple specifies the bpf filter expression (filter expressions as specified in the tcpdump command). The filter expression must be enclosed in double quotation marks. The filter expression and the protocol that is specified in the fourth tuple must be compatible. The * value is not supported in this tuple.

Refer to the tcpdump documentation for detailed information on filter expressions.

Examples
  1. Specification format to access the built-in variables that are related to Ethernet header (__etherhdr), IP header (__ip4hdr or __ip6hdr), and TCP header (__tcphdr) information from the Vue script when interface en0 receives or sends a packet on port 23 (filter string "port 23"):
    @@net:bpf:en0:tcp:"port 23"
  2. Specification format to access the built-in variables that are related to Ethernet header (__etherhdr), IP header (__ip4hdr or __ip6hdr), and UDP header (__udphdr) information from the Vue script when the system receives or sends a packet from or to host example.com (filter string "host example.com") on the en0 and en1 interfaces:
    @@net:bpf:en0|en1:udp:"host example.com"
  3. Specification format to access the raw packet information when the system receives or sends a packet from or to host example.com on the en1 interface (filter string "host example.com"):
    @@net:bpf:en1:*:"host example.com"
Note: Each bpf probe specification uses a bpf device. These devices are shared by ProbeVue, tcpdump, and any other application that uses the libpcap or bpf services for packet capture and injection. The number of bpf probes depends on the number of available bpf devices in the system.

When a bpf probe is started, the __mdata variable contains the raw packet data. You can access the raw data of the requested size by using the Vue function copy_kdata() and map it to the Ethernet header, IP header, and so on. Define structures such as the one in the following example to find the header and payload data information.

Example

Vue script to access the raw packet data when * is specified as the protocol:

/* Define the ether header structure */
struct ether_header {
        char  ether_dhost[6];
        char  ether_shost[6];
        unsigned short ether_type;   /* unsigned, so that %x does not sign-extend */
};

/* ProbeVue script to access and interpret the data from the raw packet */

@@net:bpf:en0:*:"port 23"
{
        /* Define the script local variables */
        __auto struct ether_header eth;
        __auto char *mb;

        /* __mdata contains the address of the packet data */
        mb = (char *)__mdata;
        printf("Network probevue\n");

        /*
         * Use the copy_kdata() Vue function to copy data of the
         * requested size (size of ether_header) from the mbuf data
         * pointer to the eth (ether_header) variable.
         */
        copy_kdata(mb, eth);
        printf("Ether type from raw data: %x\n", eth.ether_type);
}

TCP probes

The specification contains three tuples for TCP probes as described in the following table:

Table 30. TCP probes: Tuple specification
Second tuple (Sub type) Events (Third tuple; the * value is not supported) Description
tcp state_change This probe is started whenever the TCP state changes.
send_buf_full This probe is started whenever the send buffer full event occurs.
recv_buf_full This probe is started whenever the receive buffer full event occurs.
retransmit This probe is started whenever a packet is retransmitted for a TCP connection.
listen_q_full This probe is started whenever a server (listener socket) discards new connection requests because the listener’s queue is full.

The __proto_info built-in variable provides the TCP connection (four-tuple) information (local IP address, remote IP address, local port, and remote port) whenever a TCP-related event occurs. The remote port and remote IP address contain a value of NULL for the listen_q_full event.

Example

Probe specifications for TCP protocol state changes:

@@net:tcp:state_change

UDP probes

The specification contains three tuples for UDP probes as described in the following table:

Table 31. udp second tuple: Third tuple values
Second tuple (Sub type) Events (Third tuple; the * value is not supported) Description
udp sock_recv_buf_overflow This probe is started whenever the datagram or the UDP socket’s receive buffer overflows.

The __proto_info built-in variable provides the UDP protocol-related data (source and destination IP addresses, source and destination port numbers) whenever the socket receive buffer overflow event occurs.

Example

Probe specifications for UDP socket’s receive buffer overflow:

@@net:udp:sock_recv_buf_overflow

Network probe-related built-in variables for Vue scripts

Network-related events can be probed by using the following built-in variables.

__etherhdr built-in variable

The __etherhdr variable is a special built-in variable to get the Ethernet header information from a filtered packet. This built-in variable is available when you probe the packet information at the interface layer with any one of these protocols: “ether”, “ipv4”, “ipv6”, “tcp”, “udp”, “icmp4”, “icmp6”, “igmp”, “arp”, and “rarp”. This variable is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __etherhdr->member.

The __etherhdr built-in variable has the following members:

Table 32. The __etherhdr built-in variable members
Member name Type Description
src_addr mac_addr_t Source MAC address.

The data type mac_addr_t is used to store the MAC address. Use the format specifier “%M” to print the MAC address.

dst_addr mac_addr_t Destination MAC address.

The data type mac_addr_t is used to store the MAC address. Use the format specifier “%M” to print the MAC address.

ether_type unsigned short Indicates the protocol that is encapsulated in the payload of an Ethernet frame. Protocols can be IPv4, IPv6, ARP, and REVARP.

It can match one of the following built-in constant values for ether_type:

  • ETHERTYPE_IP
  • ETHERTYPE_IPV6
  • ETHERTYPE_ARP
  • ETHERTYPE_REVARP

Refer to the header files /usr/include/netinet/if_ether.h and /usr/include/netinet/if_ether6.h for ether_type values.

Note: The __etherhdr built-in variable is applicable only for Ethernet interfaces and not for loopback interfaces.

__ip4hdr built-in variable

The __ip4hdr variable is a special built-in variable to get the IPv4 header information from a filtered packet. This variable is available when you probe the packet information at the interface layer with any one of the protocols “ipv4”, “tcp”, “udp”, “icmp4”, and “igmp”, and it has valid data when the IP version is IPv4. This variable is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __ip4hdr->member.

This built-in variable has the following members:

Table 33. The __ip4hdr built-in variable members
Member name Type Description
src_addr ip_addr_t Source IP address.

The data type ip_addr_t is used to store the IP address. Use the format specifier “%I” to print the IP address in dotted decimal format and the format specifier “%H” to print the host name. Host name printing is a costly operation.

dst_addr ip_addr_t Destination IP address.

The data type ip_addr_t is used to store the IP address. Use the format specifier “%I” to print the IP address in dotted decimal format and the format specifier “%H” to print the host name. Host name printing is a costly operation.

protocol unsigned short Indicates the protocol that is used in the data portion of the IP datagram. Protocols can be TCP, UDP, ICMP, IGMP, FRAGMENTED, and so on.

It can match one of the following built-in constant values for protocol:

IPPROTO_HOPOPTS,
IPPROTO_ICMP,
IPPROTO_IGMP,
IPPROTO_TCP,
IPPROTO_UDP,
IPPROTO_ROUTING,
IPPROTO_FRAGMENT,
IPPROTO_NONE,
IPPROTO_LOCAL

Refer to the header file /usr/include/netinet/in.h for protocol values.

ttl unsigned short Time to live or hop limit.
cksum unsigned short IP header checksum.
id unsigned short Identification number. This member is used for uniquely identifying the group of fragments of a single IP datagram.
total_len unsigned short Total length. This value is the entire packet (fragment) size in bytes, including the IP header and data.
hdr_len unsigned short Size of the IP header.
tos unsigned short Type of service.
frag_offset unsigned short Fragment offset.

This value specifies the offset of a particular fragment relative to the beginning of the original unfragmented IP datagram. The first fragment has an offset of zero.

The value can also carry the built-in constant frag_offset flag values. To check for the presence of a particular flag, the value must be bitwise ANDed with the built-in constant flag value:

  • IP_DF (don't fragment flag)
  • IP_MF (more fragments flag)

Refer to the header file /usr/include/netinet/ip.h for flag values.
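
As a minimal sketch of the flag check that is described for frag_offset, the following Vue clause reports IPv4 packets that have the more-fragments flag set. The interface name en0 and the filter string "ip" are assumptions that are chosen for illustration; the built-in members and constants are the ones that are listed in Table 33.

```
@@net:bpf:en0:ipv4:"ip"
{
        /* IP_MF is set on every fragment except the last one */
        if (__ip4hdr->frag_offset & IP_MF) {
                printf("fragment id=%d from %I to %I, total_len=%d\n",
                        __ip4hdr->id, __ip4hdr->src_addr, __ip4hdr->dst_addr,
                        __ip4hdr->total_len);
        }
}
```

Testing IP_DF works the same way; only the constant in the bitwise AND changes.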

__ip6hdr built-in variable

The __ip6hdr variable is a special built-in variable to get the IPv6 header information from a filtered packet. This variable is available when you probe the packet information at the interface layer with any one of the protocols “ipv6”, “tcp”, “udp”, and “icmp6”, and it has valid data when the IP version is IPv6. This variable is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __ip6hdr->member.

This built-in variable has the following members:

Table 34. The __ip6hdr built-in variable members
Member name Type Description
src_addr ip_addr_t Source IP address.

The data type ip_addr_t is used to store the IP address. Use the format specifier “%I” to print the IP address and the format specifier “%H” to print the host name. Host name printing is a costly operation.

dst_addr ip_addr_t Destination IP address.

The data type ip_addr_t is used to store the IP address. Use the format specifier “%I” to print the IP address and the format specifier “%H” to print the host name. Host name printing is a costly operation.

protocol unsigned short Indicates the protocol that is used in the data portion of the IP datagram. Protocols can be TCP, UDP, ICMPV6, and so on.

It can match one of the following built-in constant values for protocol:

IPPROTO_TCP, IPPROTO_UDP, IPPROTO_ROUTING, IPPROTO_ICMPV6, IPPROTO_NONE, IPPROTO_DSTOPTS, IPPROTO_LOCAL

Refer to the header file /usr/include/netinet/in.h for protocol values.

hop_limit unsigned short Hop limit (time to live).
total_len unsigned short Total length (payload length). The size of the payload including any extension headers.
next_hdr unsigned short Specifies the type of the next header. This field usually specifies the transport layer protocol that is used by a packet's payload. When extension headers are present in the packet, this field indicates which extension header follows. The values are shared with those used for the IPv4 protocol field.
flow_label unsigned int Flow label.
traffic_class unsigned int Traffic class.

__tcphdr built-in variable

The __tcphdr variable is a special built-in variable to get the TCP header information from a filtered packet. This variable is available when you probe the packet information at the interface layer with tcp as the protocol. It is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __tcphdr->member.

The __tcphdr built-in variable has the following members:

Table 35. The __tcphdr built-in variable members
Member name Type Description
src_port unsigned short Source port of the packet.
dst_port unsigned short Destination port of the packet.
flags unsigned short The control bits, which are set to indicate the communication of control information. Each flag occupies 1 bit.

The value can carry one or more of the built-in constant flag values. To check for the presence of a particular flag, the value must be bitwise ANDed with the built-in constant flag value:

  • TH_FIN (No more data from sender)
  • TH_SYN (Request to establish the connection)
  • TH_RST (Reset the connection)
  • TH_PUSH (Push function. Asks to push the buffered data to the receiving application)
  • TH_ACK (Indicates that this packet contains an acknowledgment)
  • TH_URG (Indicates that the urgent pointer field is significant)

Refer to the TCP documentation for detailed information about these flags, and refer to the header file /usr/include/netinet/tcp.h for flag values.

seq_num unsigned int Sequence number.
ack_num unsigned int Acknowledgment number.
hdr_len unsigned int TCP header length.
cksum unsigned short Checksum.
window unsigned short Window size.
urg_ptr unsigned short Urgent pointer.
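
The bitwise test on the flags member can be sketched as follows. This clause is an illustration only: it assumes IPv4 traffic on en0 and reuses the "port 23" filter from the earlier examples, and it reports packets that carry SYN without ACK, which is the first packet of a new connection.

```
@@net:bpf:en0:tcp:"port 23"
{
        /* SYN set and ACK clear: a new connection request */
        if ((__tcphdr->flags & TH_SYN) && !(__tcphdr->flags & TH_ACK)) {
                printf("connection request: %I:%d -> %I:%d\n",
                        __ip4hdr->src_addr, __tcphdr->src_port,
                        __ip4hdr->dst_addr, __tcphdr->dst_port);
        }
}
```

For IPv6 traffic, the addresses would come from __ip6hdr instead of __ip4hdr.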

__udphdr built-in variable

The __udphdr variable is a special built-in variable that is used to get the UDP header information from a filtered packet. This built-in variable is available when you probe the packet information at the interface layer with udp as the protocol. It is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __udphdr->member.

The __udphdr built-in variable has the following members:

Table 36. The __udphdr built-in variable members
Member name Type Description
src_port unsigned short Source port of the packet.
dst_port unsigned short Destination port of the packet.
length unsigned short UDP header and data length information.
cksum unsigned short Checksum.

__icmp built-in variable

The __icmp variable is a special built-in variable that is used to get the ICMP header information from a filtered packet. This built-in variable is available when you probe the packet information at the interface layer with icmp4 as the protocol. It is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __icmp->member.

This built-in variable has the following members:

Table 37. The __icmp built-in variable members
Member name Type Description
type unsigned short Type of ICMP message.

For example: 0 - echo reply, 8 - echo request, 3 - destination unreachable. For all the types, refer to the standard network documentation.

It can match one of the following built-in constant values for ICMP message types:

ICMP_ECHOREPLY,
ICMP_UNREACH,
ICMP_SOURCEQUENCH,
ICMP_REDIRECT,
ICMP_ECHO,
ICMP_TIMXCEED,
ICMP_PARAMPROB,
ICMP_TSTAMP,
ICMP_TSTAMPREPLY,
ICMP_IREQ,
ICMP_IREQREPLY,
ICMP_MASKREQ,
ICMP_MASKREPLY

Refer to the header file /usr/include/netinet/ip_icmp.h for message type values.

Note: Not all possible message type values are defined, and hence there might be other options present in the value.

code unsigned short Subtype of ICMP message.

For each type of message, several different codes and subtypes are defined. For example, no route to destination, communication with destination administratively prohibited, not a neighbor, address unreachable, port unreachable. For more information, refer to the standard network documentation.

It can match one of the following built-in constant values for ICMP sub types.

Subtype values for the ICMP_UNREACH type:

ICMP_UNREACH_NET
ICMP_UNREACH_HOST
ICMP_UNREACH_PROTOCOL
ICMP_UNREACH_PORT
ICMP_UNREACH_NEEDFRAG
ICMP_UNREACH_SRCFAIL
ICMP_UNREACH_NET_ADMIN_PROHIBITED
ICMP_UNREACH_HOST_ADMIN_PROHIBITED

Subtype values for the ICMP_REDIRECT type:

ICMP_REDIRECT_NET
ICMP_REDIRECT_HOST
ICMP_REDIRECT_TOSNET
ICMP_REDIRECT_TOSHOST

Subtype values for the ICMP_TIMXCEED type:

ICMP_TIMXCEED_INTRANS
ICMP_TIMXCEED_REASS

Subtype values for the ICMP_PARAMPROB type:

ICMP_PARAMPROB_PTR
ICMP_PARAMPROB_MISSING

Refer to the header file /usr/include/netinet/ip_icmp.h for message subtype values.

Note: Not all possible message subtype values are defined, and hence there might be other options present in the message subtype value.

cksum unsigned short Checksum.
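
As an illustrative sketch, the type and code members can be compared against the built-in constants that are listed in Table 37. The interface en0 and the filter string "icmp" are assumptions; the clause reports echo requests and port-unreachable messages.

```
@@net:bpf:en0:icmp4:"icmp"
{
        /* ICMP_ECHO is the echo request message type */
        if (__icmp->type == ICMP_ECHO) {
                printf("echo request: %I -> %I\n",
                        __ip4hdr->src_addr, __ip4hdr->dst_addr);
        }
        /* The code member is meaningful only together with its message type */
        if (__icmp->type == ICMP_UNREACH && __icmp->code == ICMP_UNREACH_PORT) {
                printf("port unreachable reported by %I\n", __ip4hdr->src_addr);
        }
}
```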

__icmp6 built-in variable

The __icmp6 variable is a special built-in variable that is used to get the ICMPv6 header information from a filtered packet. This built-in variable is available when you probe the packet information at the interface layer with icmp6 as the protocol. It is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __icmp6->member.

The __icmp6 built-in variable has the following members:

Table 38. The __icmp6 built-in variable members
Member name Type Description
type unsigned short Type of ICMPV6 message.

This specifies the type of message, which determines the format of the remaining data.

It can match one of the following built-in constant values for ICMPV6 types.

ICMP6_DST_UNREACH
ICMP6_PACKET_TOO_BIG
ICMP6_TIME_EXCEEDED
ICMP6_PARAM_PROB
ICMP6_INFOMSG_MASK
ICMP6_ECHO_REQUEST
ICMP6_ECHO_REPLY 

Refer to the header file /usr/include/netinet/icmp6.h for message type values.

Note: Not all possible message type values are defined, and hence there might be other options present in the value.

code unsigned short Subtype of ICMPV6 message.

This value depends on the message type. It provides an extra level of message granularity.

It can match one of the following built-in constant values for ICMPV6 sub types.

ICMP6_DST_UNREACH_NOROUTE
ICMP6_DST_UNREACH_ADMIN
ICMP6_DST_UNREACH_ADDR
ICMP6_DST_UNREACH_BEYONDSCOPE
ICMP6_DST_UNREACH_NOPORT

Refer to the header file /usr/include/netinet/icmp6.h for message subtype values.

Note: Not all possible message subtype values are defined, and hence there might be other options present in the value.

cksum unsigned short Checksum.

__igmp built-in variable

The __igmp variable is a special built-in variable that is used to get the IGMP header information from a filtered packet. This built-in variable is available when you probe the packet information at the interface layer with igmp as the protocol. It is available in probes of sub type bpf. Its member elements can be accessed by using the syntax __igmp->member.

The __igmp built-in variable has the following members:

Table 39. The __igmp built-in variable members
Member name Type Description
type unsigned short Type of IGMP message.

For example: Membership Query (0x11), Membership Report (IGMPv1: 0x12, IGMPv2: 0x16, IGMPv3: 0x22), Leave Group (0x17). For more information, refer to the standard network documentation.

It can match one of the following built-in constant values for IGMP Message types.

IGMP_HOST_MEMBERSHIP_QUERY
IGMP_HOST_MEMBERSHIP_REPORT
IGMP_DVMRP
IGMP_HOST_NEW_MEMBERSHIP_REPORT
IGMP_HOST_LEAVE_MESSAGE
IGMP_HOST_V3_MEMBERSHIP_REPORT
IGMP_MTRACE
IGMP_MTRACE_RESP
IGMP_MAX_HOST_REPORT_DELAY

Refer to the header file /usr/include/netinet/igmp.h for message type values.

Note: Not all possible message type values are defined, and hence there could be other options present in the value.

code unsigned short Subtype of IGMP type.

It can match one of the following built-in constant values for IGMP message subtypes.

Subtype values for the IGMP_DVMRP type:

DVMRP_PROBE           1
DVMRP_REPORT          2
DVMRP_ASK_NEIGHBORS   3
DVMRP_ASK_NEIGHBORS2  4
DVMRP_NEIGHBORS       5
DVMRP_NEIGHBORS2      6
DVMRP_PRUNE           7
DVMRP_GRAFT           8
DVMRP_GRAFT_ACK       9
DVMRP_INFO_REQUEST    10
DVMRP_INFO_REPLY      11

Note: Not all possible message subtype values are defined, and hence there could be other options present in the value.
cksum unsigned short IGMP Checksum value.
group_addr ip_addr_t Group address that is reported or queried.

This address is the multicast address that is queried in a Group-Specific or Group-and-Source-Specific Query. The field has a value of zero in a General Query.

The data type ip_addr_t is used to store the group IP address. Use the format specifier “%I” to print the IP address.
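
The following clause is a minimal sketch of how the type and group_addr members might be used together. The interface en0 and the filter string "igmp" are assumptions for illustration; the constant is taken from the list in Table 39.

```
@@net:bpf:en0:igmp:"igmp"
{
        /* group_addr is zero for a General Query */
        if (__igmp->type == IGMP_HOST_MEMBERSHIP_QUERY) {
                printf("membership query from %I for group %I\n",
                        __ip4hdr->src_addr, __igmp->group_addr);
        }
}
```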

__arphdr built-in variable

The __arphdr variable is a special built-in variable that is used to get the ARP header information from a filtered packet. This variable is available when you probe the packet information at the interface layer with arp or rarp as the protocol. It is available in probes of sub type bpf. The __arphdr member elements can be accessed by using the syntax __arphdr->member.

The __arphdr built-in variable has the following members:

Table 40. The __arphdr built-in variable members
Member name Type Description
hw_addr_type unsigned short Format of the hardware address type. This field identifies the specific data-link protocol that is being used.

It can match one of the following built-in constant values for data link protocol:

ARPHRD_ETHER,
ARPHRD_802_5,
ARPHRD_802_3,
ARPHRD_FDDI

Refer to the header file /usr/include/net/if_arp.h for protocol values.

protocol_type unsigned short Format of the protocol address type. This field identifies the specific network protocol that is being used.

It can match one of the following built-in constant values for network protocol:

SNAP_TYPE_IP,
SNAP_TYPE_AP,
SNAP_TYPE_ARP,
VLAN_TAG_TYPE

Refer to the header file /usr/include/net/nd_lan.h for the protocol values.

hdr_len unsigned short MAC or hardware address length.
proto_len unsigned short Protocol or IP address length.
operation unsigned short Specifies the operation that the sender is performing: 1 for request, 2 for reply.

It can match one of the following built-in constant values for the operation:

ARPOP_REQUEST,
ARPOP_REPLY

Refer to the header file /usr/include/net/if_arp.h for operation values.

src_mac_addr mac_addr_t Sender or source MAC address.

Sender hardware or mac address is stored in mac_addr_t data type. The format specifier “%M” is used to print sender MAC or hardware address.

dst_mac_addr mac_addr_t Target or Destination MAC address.

Target hardware or MAC address is stored in mac_addr_t data type. The format specifier “%M” is used to print target MAC or hardware address.

src_ip ip_addr_t Source or sender IP address.

Sender IP address is stored in ip_addr_t data type.

The format specifier “%I” is used to print sender IP address.

dst_ip ip_addr_t Target or Destination IP address.

Target IP address is stored in ip_addr_t data type.

The format specifier “%I” is used to print target IP address.
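
As a sketch of how the __arphdr members fit together, the following clause prints ARP requests and replies with the “%I” and “%M” format specifiers that are described in Table 40. The interface en0 and the filter string "arp" are assumptions that are chosen for illustration.

```
@@net:bpf:en0:arp:"arp"
{
        /* In a request, dst_ip is the address that is being resolved */
        if (__arphdr->operation == ARPOP_REQUEST) {
                printf("request for %I sent by %I (%M)\n",
                        __arphdr->dst_ip, __arphdr->src_ip,
                        __arphdr->src_mac_addr);
        }
        /* In a reply, the sender fields carry the resolved mapping */
        if (__arphdr->operation == ARPOP_REPLY) {
                printf("reply: %I is at %M\n",
                        __arphdr->src_ip, __arphdr->src_mac_addr);
        }
}
```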

Example

Vue script to probe packet header information for packets that are received or sent over port 23. It prints the source and destination node information and the TCP header length:

@@net:bpf:en0:tcp:"port 23"
{
        printf("src_addr:%I and dst_addr:%I\n",__ip4hdr->src_addr,__ip4hdr->dst_addr);
        printf("src port:%d\n",__tcphdr->src_port);
        printf("dst port:%d\n",__tcphdr->dst_port);
        printf("tcp hdr_len:%d\n",__tcphdr->hdr_len);
}

Output:
# probevue bpf_tcp.e
src_addr:10.10.10.12 and dst_addr:10.10.18.231
src port:48401
dst port:23
tcp hdr_len:20
..................
.................

__proto_info built-in variable

The __proto_info variable is a special built-in variable that is used to get the protocol (source and destination IP addresses and ports) information for TCP or UDP events. The __proto_info variable is available in probes of sub type tcp or udp. Its member elements can be accessed by using the syntax __proto_info->member.

The __proto_info built-in variable has the following members:

Table 41. The __proto_info built-in variable members
Member name Type Description
local_port unsigned short Local port
remote_port unsigned short Remote port
local_addr ip_addr_t Local address
remote_addr ip_addr_t Remote address

Additional information for TCP-specific events

The TCP state change events are described in the following table:

Table 42. TCP state change events
Name Type Description
__prev_state short Previous state information for connection.
__cur_state short Present state information for connection.
It can match one of the following built-in constant values for TCP states:
  • TCPS_ESTABLISHED (Connection established)
  • TCPS_CLOSED (Connection closed)
  • TCPS_LISTEN (Listening for connection)
  • TCPS_SYN_SENT (Sent SYN to remote end)
  • TCPS_SYN_RECEIVED (Received SYN from remote end)
  • TCPS_CLOSE_WAIT (Received FIN, waiting for close)
  • TCPS_FIN_WAIT_1 (Have closed, sent FIN)
  • TCPS_CLOSING (Closed, exchanged FIN, awaiting FIN ACK)
  • TCPS_LAST_ACK (Had FIN and close, awaiting FIN ACK)
  • TCPS_FIN_WAIT_2 (Have closed, FIN is acknowledged)
  • TCPS_TIME_WAIT (In 2*MSL quiet wait after close)

The values are defined in exported header file /usr/include/netinet/tcp_fsm.h.

Example:

The following Vue script provides state change information for a particular connection:

@@net:tcp:state_change
when (__proto_info->local_addr == "10.10.10.1" && __proto_info->remote_addr == "10.10.10.2"
	&& __proto_info->local_port == 8000 && __proto_info->remote_port == 9000)
{
	printf("Previous state:%d and current_state:%d\n",__prev_state,__cur_state);
}
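
The built-in TCP state constants can also be used in a predicate. The following is a minimal sketch, not taken from the product documentation, that reports every connection entering the established state. It assumes that the constants listed in Table 42 can be compared directly with __cur_state, as the table description suggests.

@@net:tcp:state_change
when (__cur_state == TCPS_ESTABLISHED)
{
	/* %I prints an ip_addr_t value; the ports are unsigned short values */
	printf("Established: local %I port %d, remote %I port %d\n",__proto_info->local_addr,__proto_info->local_port,__proto_info->remote_addr,__proto_info->remote_port);
	printf("Previous state:%d\n",__prev_state);
}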

TCP retransmit event

Table 43. TCP retransmit event
Name Type Description
__nth_retransmit unsigned short Nth retransmission

Examples

1. The following example identifies the listener that has discarded connections because its listen queue is full.

@@net:tcp:listen_q_full
{
	printf("Listener IP address:%I and Port number is:%d\n",__proto_info->local_addr, __proto_info->local_port);
}

2. The following example identifies connections that drop packets because of socket buffer overflows.

@@net:udp:sock_recv_buf_overflow
{
        printf("Connection information which drops packet due to socket buffer overflows:\n"); 
        printf("Local IP address:%I and Remote IP address:%I\n",__proto_info->local_addr,__proto_info->remote_addr);
        printf("local port :%d and remote port:%d\n",__proto_info->local_port, __proto_info->remote_port);
 }

3. The following example identifies retransmissions (the second and further retransmissions of a packet) for a particular TCP connection.

@@net:tcp:retransmit
when (__proto_info->local_addr == "10.10.10.1" &&
	__proto_info->remote_addr == "10.10.10.2" &&
	__proto_info->local_port == 4000 &&
	__proto_info->remote_port == 5000)
{
        printf(" %d th retransmission for this connection\n", __nth_retransmit);
}

4. The following example identifies the connection information whenever a send buffer full event occurs.

@@net:tcp:send_buf_full
{
        printf("Connection information whenever send buffer full event occurs:\n"); 
        printf("Local IP address:%I and Remote IP address:%I\n",__proto_info->local_addr,__proto_info->remote_addr);
        printf("local port :%d and remote port:%d\n",__proto_info->local_port, __proto_info->remote_port);
 }

Sysproc probe manager

Overview

The sysproc probe manager provides an infrastructure that lets users and administrators dynamically trace process-related or thread-related data without knowing the internals of the sysproc subsystem.

The aspects of the sysproc subsystem that are of interest to a user or administrator are divided into the following main categories:

  • Process (or thread) creation or termination
  • Signal generation and delivery
  • Scheduler and dispatcher events
  • DR and CPU binding events

Process (or thread) creation or termination

A system administrator needs information about how a process or thread is created and destroyed to administer the resources of the system. The sysproc probe manager addresses the following important use cases:
  • Did a process exit naturally or because of an error?
  • When was a process or thread created or terminated, or a program exec'ed?
  • How long did a process run?
  • Track events when a thread receives or returns from an exception.

Signal generation and delivery

Signals decide the current state of a process or thread in a system. To understand misbehaving processes or threads, an administrator uses the state of signals and the current state of the processes that results from these signals. The important use cases in the signal generation and delivery category that this probe manager addresses include, but are not limited to, the following:
  • Signal source and signal information for a specific target.
  • Signal delivery of asynchronous signals.
  • Trace signal clears.
  • Trace events when a signal handler other than default is installed.
  • Signal target and signal information for a specific source.
  • Trace signal handler entry or exit.

Scheduler and dispatcher events

The scheduler and dispatcher dictate how a process or thread runs in the system. An administrator can analyze system performance by dynamically tracing the scheduler or dispatcher subsystem.

Dynamic tracing of the scheduler or dispatcher subsystem helps discover the reasons for the retention of threads.

The important use cases in the scheduler and dispatcher events category that the sysproc probe manager addresses include, but are not limited to, the following:
  • Trace thread or threads that are enqueued or dequeued from the run queue.
  • Trace events when any thread in the system is preempted.
  • Trace when a thread is being put to sleep over an event.
  • Trace when a sleeping thread is being woken up.
  • Track the dispatch latency of a thread.
  • Track virtual processor folding events.
  • Trace change in any kernel thread priority.

Dynamic Reconfiguration (DR), and CPU binding events

This class of probes offers dynamic tracing capabilities to a user who tracks resources bound to a process.

The important use cases in this category that the sysproc probe manager addresses include, but are not limited to, the following:
  • Track when a thread binding changes from one CPU to another.
  • Track when the resources are attached or detached to a process.
  • Track CPU binding events.
  • Track start or end of a DR event.

Probe specification

The following format must be used in a Vue script to probe sysproc events:

@@sysproc:<sysproc_event>:<pid/tid/*>

The first tuple, @@sysproc, indicates that this probe is specific to sysproc events.

The second tuple specifies the event to be probed.

The third tuple acts as a filter that isolates the events specified through the second tuple, based on a process ID or kernel thread ID.
Note: Using a process ID or kernel thread ID as a filter in sysproc probes does not guarantee that the event occurs in the context of that process or thread. The sysproc probe manager uses the process or thread ID only as a filter. These events might be useful from a process or thread perspective regardless of the execution context of the probe event.

An example is the signal send event, where either the process that sends the signal or the process that receives it can be of interest. The probe descriptions that follow specify the appropriate filter for such probe events.
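
For illustration, the following minimal sketch filters the sendsig event on the process that receives the signal. It assumes that the target process ID is passed as the first script argument ($1) and uses the __sigsendinfo built-in that is described under the sendsig probe. The script name sendsig_target.e and the process ID on the command line are hypothetical.

@@sysproc:sendsig:$1
{
	/* Special sysproc built-in members are of type long long, hence %llu */
	printf ("Source=%llu sig=%llu\n",__sigsendinfo->spid,__sigsendinfo->signo);
}

# probevue sendsig_target.e 6619618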

Probe points (events of interest)

A brief description of all events that can be probed through the sysproc probe manager is mentioned in the following table:

Table 44. Sysproc probe events
Probe (sysproc_event) Description
forkfail Track failures in fork interface.
execfail Track failures in exec interface.
execpass Track exec success.
exit Track exit of a process.
threadcreate Track creation of a kernel thread.
threadterminate Track termination of a kernel thread.
threadexcept Track process exceptions.
sendsig Track signal sent to a process by external sources.
sigqueue Track signals queued to a process.
sigdispose Track signal disposals.
sigaction Track signal handler installations and reinstallations.
sighandlestart Track when a signal handler is about to be called.
sighandlefinish Track when a signal handler completes.
changepriority Track when the priority of a process changes.
onreadyq Track when a kernel thread gets on a ready queue.
offreadyq Track when a kernel thread is moved out of ready queue.
dispatch Track when the system dispatcher is called to schedule a thread
oncpu Track when a kernel thread acquires CPU.
offcpu Track when a kernel thread relinquishes CPU.
blockthread Track when a thread is blocked from getting CPU.
foldcpu Track folding of a CPU core.
bindprocessor Track event when a process/thread is bound to a CPU
changecpu Track events when a kernel thread changes CPU temporarily
resourceattach Track events when a resource is attached to another
resourcedetach Track events when a resource is detached from another
drphasestart Track when a drphase is getting initiated
drphasefinish Track when a drphase completes

Method to access data at a probe-point

ProbeVue allows data access through built-in variables.

Built-in values fall into three types, categorized by accessibility:
  1. Accessible at any probe point, irrespective of the probe manager. For example: __curthread.
  2. Accessible throughout probes of a specific probe manager.
  3. Accessible only at defined probes (events of interest).
The sysproc probe manager allows data access through built-ins of types (1) and (3). Special built-ins that are provided for the sysproc probe manager are of type long long.

The following built-ins are of type (1):

  • __trcid
  • __errno
  • __kernelmode
  • __arg1 to __arg7
  • __curthread
  • __curproc
  • __mst
  • __tid
  • __pid
  • __ppid
  • __pgid
  • __uid
  • __euid
  • __ublock
  • __execname
  • __pname

The built-in variables are also classified as context specific and context independent. Context-specific built-ins provide data based on the execution context of the probe.

The AIX kernel operates in thread or interrupt context. Context-specific built-ins produce correct results when the probe starts in thread or process context.

Results that are obtained from context-specific built-ins in interrupt execution context might be unexpected. Context-independent built-ins do not depend on the execution context and can be accessed safely irrespective of probe execution environment.

Table 45. Context specific and independent built-in variables
Context specific built-in variables Context independent built-in variables
__curthread __trcid
__curproc __errno
__tid __kernelmode
__pid __arg1 to __arg7
__ppid __mst
__pgid  
__uid  
__euid  
__ublock  
__pname  
__execname  

Probe points

Probe points are the specific events for which a probe is fired. The probe points are described in the following sections.

forkfail

The forkfail probe starts when fork fails. This probe determines the reasons of fork failure.

Syntax: @@sysproc:forkfail:<pid/tid/*>

Special built-in supported

__forkfailinfo
{ 
fail_reason; 
}

The fail_reason variable has one of the following values:

Table 46. forkfail probe: Failure reasons
Reason Description
FAILED_RLIMIT Failed due to rlimit limitations
FAILED_ALLOCATIONS Failed due to internal resource allocations
FAILED_LOADER Failed at a loader stage
FAILED_PROCDUP Failed at procdup

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid, __ublock, __execname, __pname.

Execution environment

Runs in process environment.

Example

The following example shows how to monitor all fork failures in the system that are caused by rlimit.

@@BEGIN 
{ 
         x = 0; 
} 

@@sysproc:forkfail:* 
        when (__forkfailinfo->fail_reason == FAILED_RLIMIT) 
{ 
                printf ("process %s with pid %llu failed to fork a child\n",__pname,__pid); 
                x++; 
} 
 
@@END 
{ 

        printf ("Found %d failures during this vue session\n",x); 
} 

execfail

The execfail probe starts when an exec function call fails. Use the execfail probe to determine the reasons for the failure.

Syntax: @@sysproc:execfail:<pid/tid/*>

Table 47. execfail probe: Failure reasons
Reason Description
FAILED_PRIVILEGES New process failed to acquire or inherit privileges
FAILED_COPYINSTR New process failed to copy instruction
FAILED_V_USERACC New process failed to discard v_useracc regions
FAILED_CLEARDATA Failed during clearing data for new process
FAILED_PROCSEG Failed to establish process private segment
FAILED_CH64 Failed to convert to a 64-bit process
FAILED_MEMATT Failed to attach to a memory resource set
FAILED_SRAD Failed to attach to a srad
FAILED_MSGBUF Error message buffer length is zero
FAILED_ERRBUF Failed to allocate error message buffer
FAILED_ENVAR Failed to allocate environment variables
FAILED_CPYSTR Copy string error
FAILED_ERRBUFCPY Failed to copy the error messages from errmsg_buf
FAILED_TOOLNGENV Env too long for allocated memory
FAILED_USRSTK Failed to setup user stack
FAILED_CPYARG Failed to copy arglist to stack
FAILED_INITPTRACE Failed to init ptrace
Note: 64 is added to error value if loader error is encountered.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid, __ublock, __execname, __pname.

Execution environment

Runs in process environment.
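
Example

The following minimal sketch, modeled on the forkfail example, counts exec failures in the system. It uses only the general built-ins listed above (__pname, __pid, and __errno); the counter name failures is illustrative, and the %d format for __errno is an assumption.

@@BEGIN
{
	failures = 0;
}

@@sysproc:execfail:*
{
	/* Report the calling process and the error value for each failed exec */
	printf ("process %s with pid %llu failed to exec, errno %d\n",__pname,__pid,__errno);
	failures++;
}

@@END
{
	printf ("Found %d exec failures during this vue session\n",failures);
}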

exit

This probe starts when a process exits. Exit is also a system call and can be traced through the system call probe manager. Probing exit through the sysproc probe manager explains the nature and reasons of the exit. It also covers the case where a user thread is terminated in kernel space and does not return to user space.

Syntax: @@sysproc:exit:<pid/tid/*>

A program can exit because of the following reasons:

  • On reaching a terminal condition when a user space program cannot proceed further.
  • On receiving a terminal signal.

Special built-in supported

__exitinfo{
	signo;	      	
	returnval;	    
	iscore;	     		
}

Where signo is the signal number that caused the process termination and returnval is the value that is returned by exit. A nonzero signo is valid only if the program is stopped by a signal.

The iscore variable is set when a core is generated as a result of process exit.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid, __ublock, __execname, __pname.

Execution Environment

Runs in process environment.

Example

The following example shows how to probe the exit event:

echo '@@sysproc:exit:* { printf (" %s    %llu    %llu\n", __pname, __pid,__exitinfo->returnval);}' | probevue

An output similar to the following example is displayed.

 ksh    5833042    0 
 telnetd    7405958    1 
 dumpctrl    7405960    0 
 setmaps    7275006    0 
 termdef    7274752    0 
 hostname    7274754    0 
 id    8257976    0 
 id    8257978    0 
 uname    8257980    0 
 expr    8257982    1 
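
The signo and iscore members can be used in the same way. The following minimal sketch reports only the processes that were terminated by a signal and indicates whether a core was generated. It assumes that signo is 0 for a normal exit, as the description of __exitinfo implies.

@@sysproc:exit:*
	when (__exitinfo->signo != 0)
{
	/* The __exitinfo members are of type long long, hence %llu */
	printf ("%s %llu terminated by signal %llu core=%llu\n",__pname,__pid,__exitinfo->signo,__exitinfo->iscore);
}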

threadcreate

The threadcreate probe starts when a thread is created successfully.

Syntax: @@sysproc:threadcreate:<pid/tid/*>

Note: The specified pid or tid must be the process or thread ID of the process or thread that created the thread.

Special built-in supported

__threadcreateinfo
{
	tid;	
	pri;       
	policy; 
}

where tid is the thread ID of the newly created thread, pri is the priority of the thread, and policy is the scheduling policy of the thread.

Table 48. Policy values for the threadcreate probe
Policy Description
SCHED_OTHER default AIX scheduling policy
SCHED_FIFO first in-first out scheduling policy
SCHED_RR round robin scheduling policy
SCHED_LOCAL local thread scope scheduling policy
SCHED_GLOBAL global thread scope scheduling policy
SCHED_FIFO2 FIFO with RQHEAD after short sleep
SCHED_FIFO3 FIFO with RQHEAD all the time
SCHED_FIFO4 FIFO with weak preempt

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid, __ublock, __execname, __pname.

Execution environment

Runs in process environment (user or kproc).

Example

To continuously print, for all processes in the system that create a thread, the process name, the ID of the creating process, the ID of the newly created thread, and the creation timestamp:

echo '@@sysproc:threadcreate:* 
{ printf ("%s %llu %llu %A\n",__pname,__pid,__threadcreateinfo->tid,timestamp());}' | probevue

An output similar to the following example is displayed.

nfssync_kproc 5439964 23921151 Feb/22/15 09:22:38 
nfssync_kproc 5439964 24052201 Feb/22/15 09:22:38 
nfssync_kproc 5439964 23920897 Feb/22/15 09:22:38 
nfssync_kproc 5439964 22479285 Feb/22/15 09:22:55 
nfssync_kproc 5439964 23920899 Feb/22/15 09:22:55 
nfssync_kproc 5439964 22479287 Feb/22/15 09:22:55

threadterminate

This probe starts when a thread is terminated.

Syntax: @@sysproc:threadterminate:<pid/tid/*>

Note: The specified process ID or thread ID must correspond to the process or thread that is being terminated.

Special built-ins supported

None.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process environment (user or kproc).

Example

To continuously print, for all processes in the system that terminate a thread, the process name, the process ID, the ID of the terminating thread, and the termination timestamp:

# echo '@@sysproc:threadterminate:* { printf ("%s %llu %llu %A\n",__pname,__pid,__tid,timestamp());}' | probevue 

An output similar to the following example is displayed.
nfssync_kproc 5439964 23855555 Feb/22/15 09:59:30 
nfssync_kproc 5439964 21758249 Feb/22/15 09:59:30 
nfssync_kproc 5439964 23855557 Feb/22/15 09:59:30

threadexcept

This probe starts when a program exception occurs. A program exception is generated when the system detects a condition in which a program cannot continue normally. Some exceptions are fatal (for example, an illegal instruction) while others can be recovered from (for example, an address space change).

Syntax: @@sysproc:threadexcept:<pid/tid/*>

Special built-ins supported


__threadexceptinfo
{
	pid;
	tid;
	exception;
	excpt_address;
}

where pid is the process ID of the process that received the exception, tid is the thread ID of the kernel thread that received the exception, excpt_address is the address that caused the exception, and exception takes one of the values in the following table.

Table 49. Exception values for the threadexcept probe
Exception Description
EXCEPT_FLOAT Floating point exception
EXCEPT_INV_OP Invalid op-code
EXCEPT_PRIV_OP Privileged op in user mode
EXCEPT_TRAP Trap instruction
EXCEPT_ALIGN Code or data alignment
EXCEPT_INV_ADDR Invalid address
EXCEPT_PROT Protection
EXCEPT_IO Synchronous I/O
EXCEPT_IO_IOCC I/O exception from IOCC
EXCEPT_IO_SGA I/O exception from SGA
EXCEPT_IO_SLA I/O exception from SLA
EXCEPT_IO_SCU I/O exception from SCU
EXCEPT_EOF Reference beyond end-of-file (mmap)
EXCEPT_FLOAT_IMPRECISE Imprecise floating point exception
EXCEPT_ESTALE_I Stale text segment exception
EXCEPT_ESTALE_D Stale data segment exception
EXCEPT_PT_WATCHP Hit ptrace watchpoint

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process or interrupt environment.

Note: Because this probe can start in an interrupt context, built-in variables such as __pid and __tid, which depend on the execution context, might not indicate the process or thread ID of interest. The special built-in members for this probe guarantee the correct process or thread ID.

Example

The following example traces the program exceptions that are generated when a program is traced by a debugger.

# cat threadexcept.e 
@@sysproc:threadexcept:* 
{ 
 printf ("PID = %llu TID= %llu EXCEPTION=%llu ADDRESS = %llu\n",__threadexceptinfo->pid,__threadexceptinfo->tid,__threadexceptinfo->exception,__threadexceptinfo->excpt_address); 
} 

Run a debugging session on a program compiled with debugging support

# dbx a.out 
Type 'help' for help. 
Core file "core" is older than current program (ignored) 
reading symbolic information ... 
(dbx) stop in main 
[1] stop in main 
(dbx) r 
[1] stopped in main at line 5 
    5           int a=5;

An output similar to the following example is displayed.

PID = 6816134 TID= 24052015 EXCEPTION=131 ADDRESS = 268436372

sendsig

This probe starts when a signal is sent to a process through an external source (another process, a process in user space, kernel streams, or interrupt context).

Syntax: @@sysproc:sendsig:<pid/*>

where pid is the process identifier of the target process that receives the signal. This probe does not allow specifying a thread identifier to filter results specific to a thread.

Special built-ins

__sigsendinfo{
	tpid;	← target pid
	spid;	← source pid
	signo;	← signal sent
}

where tpid is the target process identifier and spid identifies the source of the signal. The spid is nonzero when the signal is sent from user space or process context, and 0 when the signal is sent from an exception or interrupt context. The signal number is contained in signo.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process or interrupt environment.

Note: Because this probe can start in an interrupt context, built-ins such as __pid and __tid, which depend on the thread execution context, might not indicate the process or thread ID of interest. The special built-in members for this probe guarantee the correct process or thread ID.

When this probe starts in process context, built-ins that depend on the execution context, such as __pid, __tid, and __curthread, provide information about the source process.

Example

To continuously print the signal source, signal target, and signal number of all signals:

echo '@@sysproc:sendsig:* {printf ("Source=%llu Target=%llu sig=%llu\n",__sigsendinfo->spid,__sigsendinfo->tpid,__sigsendinfo->signo);}' | probevue

An output similar to the following example is displayed.

Source=0 Target=6619618 sig=14 
Source=0 Target=8257944 sig=20 
Source=0 Target=8257944 sig=20

sigqueue

This probe starts when a queued signal is being sent to the process.

Syntax: @@sysproc:sigqueue:<pid/*>

Special built-ins

__sigsendinfo{
	tpid;	← target pid
	spid;	← source pid
	signo;	← signal sent
}

Because POSIX signals are queued to a process, specifying a thread identifier is not allowed in this probe.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

This probe starts in the context of the sending process. Hence, context-based built-ins refer to the sending process in this probe event.

Execution environment

This probe runs in process context.

Example


echo '@@sysproc:sigqueue:*{printf ("%llu   %llu   %llu\n",__sigsendinfo->spid,__sigsendinfo->tpid,__sigsendinfo->signo);}' | probevue

An output similar to the following example is displayed.

8258004   6095294   31

sigdispose

Syntax: @@sysproc:sigdispose:<pid/tid/*>

This probe starts when a signal is disposed to a target process. To filter this probe, specify the process ID of the process that received the signal in the probe specification.

Special built-ins


__sigdisposeinfo{
	tpid;         ← target pid
	ttid;          ← target tid 
	signo;      ← signal whose action is being taken.
	fatal;        ← will be set if the process is going to be killed as part of signal action
} 

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

This probe can start from process or interrupt context. If started from interrupt context, this probe might not provide required value for context-based built-ins.

Example

Continuously print process identifier, thread identifier, signal number and indicate if this signal disposal will result in termination of the process for all processes in the system.


cat sigdispose.e

@@sysproc:sigdispose:* 
{ 
 printf ("%llu  %llu %llu %llu\n",__sigdisposeinfo->tpid,__sigdisposeinfo->ttid, __sigdisposeinfo->signo,__sigdisposeinfo->fatal); 
} 

An output similar to one shown below is observed.

5964064  20840935 14 0 
1  65539 14 0 
4719084  19530213 14 0

sigaction

Syntax: @@sysproc:sigaction:<pid/tid/*>

This probe starts when a signal handler is installed or replaced.

Special built-ins


__sigactioninfo{
	old_sighandle;	← old signal handler function address
	new_sighandle;	← new signal handler function address
	signo;		← signal number
	rpid;		← requester's pid
}

old_sighandle is 0 if a signal handler is installed for the first time.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

This probe starts in process environment.
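
As a minimal sketch of how the __sigactioninfo members might be used, the following script prints every signal handler installation or replacement in the system. The 0x%x format for handler addresses is an assumption that follows the sighandlestart example in this document.

@@sysproc:sigaction:*
{
	/* old_sighandle is 0 when a handler is installed for the first time */
	printf ("pid %llu signal %llu old handler 0x%x new handler 0x%x\n",__sigactioninfo->rpid,__sigactioninfo->signo,__sigactioninfo->old_sighandle,__sigactioninfo->new_sighandle);
}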

Note: AIX kernel ensures that only one signal is delivered to a process or thread at a time. Another signal to that process or thread is only sent when signal delivery is finished.

Example

To track the start and finish of all signal handlers in a system:

@@sysproc:sighandlestart:* 
{ 
		signal[__tid] = __sighandlestartinfo->signo; 
		printf ("Signal handler at address 0x%x invoked for thread id %llu to handle signal %llu\n",__sighandlestartinfo->sighandle,__curthread->tid,__sighandlestartinfo->signo); 
} 


@@sysproc:sighandlefinish:* 
{ 

		printf ("Signal handler completed for thread id %llu for  signal %llu\n",__curthread->tid,signal[__tid]); 
		delete (signal,__tid); 
}

An output similar to the one shown below can be observed.

Signal handler at address 0x20001d58 invoked for thread id 19923365 to handle signal 20 
Signal handler completed for thread id 19923365 for  signal 20 
Signal handler at address 0x10003400 invoked for thread id 20840935 to handle signal 14 
Signal handler completed for thread id 20840935 for  signal 14 
Signal handler at address 0x10002930 invoked for thread id 19530213 to handle signal 14 
Signal handler completed for thread id 19530213 for  signal 14 
Signal handler at address 0x300275d8 invoked for thread id 22348227 to handle signal 14 
Signal handler completed for thread id 22348227 for  signal 14 
Signal handler at address 0x20001a3c invoked for thread id 65539 to handle signal 14 
Signal handler completed for thread id 65539 for  signal 14

sighandlefinish

This probe starts at signal handler completion.

Syntax: @@sysproc:sighandlefinish:<pid/tid/*>

Special built-ins supported: None.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process environment. Protected, context switch is not allowed on executing CPU.

changepriority

This probe starts when the priority of a process is being changed. This event is not enforced by the scheduler or dispatcher.

Syntax: @@sysproc:changepriority:<pid/tid/*>

Note: The priority change might also be unsuccessful; success of priority change is not guaranteed.

Special built-ins supported

__chpriorityinfo{
	pid;
	old_priority;   <- current priority 
	new_priority; <-  new scheduling priority of the thread.
}

Execution Environment

This probe runs in process environment.

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid, __ublock, __execname, __pname.

Example

To track all processes whose priority is being changed:

echo '@@sysproc:changepriority:* { printf ("%s priority changing from %llu to %llu\n",__pname,__chpriorityinfo->old_priority,__chpriorityinfo->new_priority);}' | probevue

An output similar to the following example is displayed.

xmgc priority changing from 60 to 17 
xmgc priority changing from 17 to 60 
xmgc priority changing from 60 to 17 
xmgc priority changing from 17 to 60 
xmgc priority changing from 60 to 17 

offreadyq

This probe starts when a thread is removed from a system run queue.

Syntax: @@sysproc:offreadyq:<pid/tid/*>

Special built-ins supported

__readyprocinfo{
	pid;		<- process id of thread becoming ready
	tid;		<- Thread id.
	priority;	<- priority of the thread
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process or interrupt environment.

Use case: Trace the time taken by a thread that is performing an I/O operation to get back on the ready queue.

@@BEGIN 
{ 
	printf ("           Pid      Tid      Time           Delta\n"); 
}   

@@sysproc:offreadyq:*
{ 
	ready[__tid] = timestamp(); 
	printf ("offreadyq: %llu %llu %W\n",__readyprocinfo->pid,__readyprocinfo->tid,ready[__tid]); 
} 

@@sysproc:onreadyq:*
{ 
	 
	if (diff_time(ready[__tid],0,MICROSECONDS)) 
	{ 
		auto:diff = diff_time (ready[__tid],timestamp(),MICROSECONDS); 
		printf ("onreadyq : %llu %llu %W      %llu\n",__readyprocinfo->pid,__readyprocinfo->tid,ready[__tid],diff); 
		delete (ready,__tid); 
	} 
}

An output similar to the following example is displayed.

           Pid      Tid      Time           Delta 
offreadyq: 7799280 20709717 5s 679697µs 
onreadyq : 7799280 20709717 5s 679697µs      6 
offreadyq: 7799280 20709717 5s 908716µs 
onreadyq : 7799280 20709717 5s 908716µs      3 
offreadyq: 7799280 20709717 6s 680186µs 
onreadyq : 7799280 20709717 6s 680186µs      5 
offreadyq: 7799280 20709717 6s 710720µs 
onreadyq : 7799280 20709717 6s 710720µs      4 
offreadyq: 7799280 20709717 6s 800720µs 
onreadyq : 7799280 20709717 6s 800720µs      2 
offreadyq: 7799280 20709717 6s 882231µs 
onreadyq : 7799280 20709717 6s 882231µs      2 
offreadyq: 7799280 20709717 6s 962313µs 
onreadyq : 7799280 20709717 6s 962313µs      2 
offreadyq: 7799280 20709717 6s 980311µs 
onreadyq : 7799280 20709717 6s 980311µs      2 

onreadyq

This probe starts when a thread is enqueued to the system ready queue or when its position in the ready queue is modified.

Syntax: @@sysproc:onreadyq:<pid/tid/*>

Special built-ins supported

__readyprocinfo{
	pid;		<- process id of thread becoming ready
	tid;		<- Thread id.
	priority;	<- priority of the thread
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process or interrupt environment.

dispatch

This probe starts when system dispatcher is called to select a thread to run on a specific CPU.

Syntax: @@sysproc:dispatch:<pid/tid/*>

Special built-in supported

__dispatchinfo{
	cpuid;		<- CPU where the selected thread will run
	oldpid;		<- pid of the thread currently running
	oldtid;		<- thread id of the thread currently running
	oldpriority;	<- priority of the thread currently running
	newpid;		<- pid of the new process selected for running
	newtid;		<- thread id of the thread selected for running
	newpriority;	<- priority of the thread selected for running
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in interrupt environment only.

Example

To print the thread IDs of the old thread and the selected thread on CPU 0, with the dispatch time relative to the start of the script:

echo '@@sysproc:dispatch:* when (__cpuid == 0){printf ("%llu %llu %W\n",__dispatchinfo->oldtid,__dispatchinfo->newtid,timestamp());}' | probevue

Output similar to the following can be observed.

24641983 20709717 0s 48126µs 
20709717 23593357 0s 48164µs 
23593357 20709717 0s 48185µs 
20709717 23593357 0s 48214µs 
23593357 20709717 0s 48230µs 
20709717 23593357 0s 48288µs 
23593357 261 0s 48303µs 
261 20709717 0s 48399µs

Example II

Time spent by threads between dispatch events on the CPU whose number is passed as the first script argument.

@@BEGIN 
{ 
	printf ("Thread cpu Time-Spent\n"); 
} 

@@sysproc:dispatch:* when (__cpuid == $1) 
{ 
	if (savetime[__cpuid] != 0) 
		auto:diff = diff_time (savetime[__cpuid],timestamp(),MICROSECONDS); 
	else 
		diff = 0; 
	savetime[__cpuid] = timestamp(); 
	printf ("%llu %llu %llu\n",__dispatchinfo->oldtid,__dispatchinfo->cpuid,diff); 
}		 
	
# probevue cputime.e 6 
Thread cpu Time-Spent 
3146085 6 0 
3146085 6 9995 
3146085 6 10002 
3146085 6 10008 
3146085 6 99988 
3146085 6 100006 
3146085 6 99995 
3146085 6 99989 
3146085 6 100010 
3146085 6 100001 
3146085 6 100000 
3146085 6 99998 

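As can be observed, thread 3146085 is re-dispatched on the CPU at a steady interval (the differences are in microseconds, so the later samples of about 100000 correspond to roughly 100 milliseconds) in the absence of any other thread competing for this CPU.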

oncpu

This probe starts when a new process or thread acquires a CPU.

Syntax: @@sysproc:oncpu:<pid/tid/*>

Where pid is the process identifier and tid is the thread identifier of the process or thread that is acquiring the CPU.

Special built-ins supported

__dispatchinfo{
	cpuid;		<- CPU where the selected thread will run
	newpid;		<- pid of the new process selected for running
	newtid;		<- thread id of the thread selected for running
	newpriority;	<- priority of the thread selected for running
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in interrupt environment only.

Example

To print the time spent by threads of syncd on all CPUs:

#!/usr/bin/probevue

@@BEGIN
{
	printf ("PROCESSID THREADID CPU TIME\n");
}

@@sysproc:oncpu:$1
{
	savetime[__cpuid] = timestamp();
}

@@sysproc:offcpu:$1
{
	if (savetime[__cpuid] != 0)
		auto:diff = diff_time (savetime[__cpuid],timestamp(),MICROSECONDS);
	else
		diff = 0;
	printf ("%llu %llu %llu %llu\n",
		__dispatchinfo->oldpid,
		__dispatchinfo->oldtid,
		__dispatchinfo->cpuid,
		diff);
}


# cputime.e `ps aux|grep syncd| grep -v grep| cut -f 6 -d " "`

Output similar to the following can be observed.

3735998 18612541 0 2
3735998 15663427 0 1
3735998 15073557 0 1
3735998 18743617 0 1
3735998 18874693 0 1
3735998 18809155 0 15
3735998 18940231 0 20
3735998 18547003 0 1
3735998 19267921 0 1
3735998 19071307 0 17
3735998 18678079 0 1
3735998 18481465 0 1
3735998 19202383 0 15
3735998 19005769 0 1
3735998 19136845 0 19
3735998 6160689 0 190

offcpu

This probe starts when a process or thread is dispatched from a CPU.

Syntax: @@sysproc:offcpu:<pid/tid/*>

Special built-ins supported

__dispatchinfo{
	cpuid;		<- CPU where the selected thread will run
	newpid;		<- pid of the new process selected for running
	newtid;		<- thread id of the thread selected for running
	newpriority;	<- priority of the thread selected for running
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in interrupt environment only.

blockthread

This probe starts when a thread is blocked from running on a CPU. Blocking is a form of sleeping in which a thread sleeps without holding any resources.

Syntax: @@sysproc:blockthread:*

Special built-ins supported

__sleepinfo{
	pid;
	tid;
	waitchan;   <-- wait channel of this sleep.	
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in interrupt environment only.
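
As an illustration, the following minimal sketch prints the members of __sleepinfo each time a thread is blocked. The format specifiers are assumptions: %llu for pid and tid follows the other examples in this section, and waitchan is assumed to be printable as a 64-bit unsigned integer.

```
@@sysproc:blockthread:*
{
	/* pid, tid, and wait channel of the thread that is being blocked */
	printf ("blockthread: %llu %llu %llu\n",
		__sleepinfo->pid,
		__sleepinfo->tid,
		__sleepinfo->waitchan);
}
```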

foldcpu

This probe starts when a CPU core is about to be folded. This probe does not fire in process context and must not be filtered with a pid or tid.

Syntax: @@sysproc:foldcpu:*

Special built-ins supported


__foldcpuinfo{
	cpuid;		<- logical cpu id which triggers core folding 
	gpcores;         <- general purpose (unfolded, non-exclusive) cores available.
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7.

Example:

To track all CPU folding events in the system:

__foldcpuinfo{
	cpuid;		<- logical cpu id which triggers core folding 
	gpcores;         <- general purpose (unfolded, non-exclusive) cores available.
}
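
A clause that reads these members might look like the following minimal sketch. It assumes that both cpuid and gpcores can be printed with %d; the cpuid members in the changecpu example are printed that way, while the type of gpcores is an assumption.

```
@@sysproc:foldcpu:*
{
	/* logical CPU that triggered the fold, and general purpose cores still available */
	printf ("foldcpu cpuid=%d gpcores=%d\n",
		__foldcpuinfo->cpuid,
		__foldcpuinfo->gpcores);
}
```

Because the probe does not run in process context, the probe specification uses * rather than a pid or tid.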

bindprocessor

Syntax: @@sysproc:bindprocessor:<pid/tid/*>

This probe starts when a thread or process is bound to a CPU. Bindprocessor is a permanent event and must not be confused with temporary CPU switches.

Special built-ins supported


__bindprocessorinfo{
	ispid;		<- 1 if the CPU is bound to a process; 0 for a thread
	id;		<- thread or process id
	cpuid;
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process environment.
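
As an illustration, the following minimal sketch reports every bind operation in the system. The format specifiers are assumptions: ispid and cpuid are assumed to be printable with %d, and id with %llu as for the pid and tid members in the other examples in this section.

```
@@sysproc:bindprocessor:*
{
	/* ispid is 1 when id is a process id and 0 when it is a thread id */
	printf ("bindprocessor: ispid=%d id=%llu cpuid=%d\n",
		__bindprocessorinfo->ispid,
		__bindprocessorinfo->id,
		__bindprocessorinfo->cpuid);
}
```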

changecpu

This probe starts when a thread changes CPU temporarily. This event is most likely to be captured during CPU funneling events or when some kprocs intentionally jump between CPUs to perform CPU-related tasks (for example, the xmgc process jumps to all CPUs to manage kernel heaps).

Syntax: @@sysproc:changecpu:*

Special built-ins supported

__changecpuinfo
{
	oldcpuid;	<-source CPU
	newcpuid; 	<- target CPU   
	pid;
	tid;		<-Thread id
}

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __curthread, __curproc, __mst, __tid, __pid, __ppid, __pgid, __uid, __euid.

Execution environment

Runs in process environment.

Example

@@sysproc:changecpu:*
{
	printf ("changecpu PID=%llu TID=%llu old_cpuid=%d new_cpuid= %d \n",
		__changecpuinfo->pid,__changecpuinfo->tid,
		__changecpuinfo->oldcpuid,__changecpuinfo->newcpuid);
}


An output like the one shown below may be observed.

changecpu PID=852254 TID=1769787 old_cpuid=26 new_cpuid= 27 
changecpu PID=852254 TID=1769787 old_cpuid=-1 new_cpuid= 0 
changecpu PID=852254 TID=1769787 old_cpuid=0 new_cpuid= 1 
changecpu PID=852254 TID=1769787 old_cpuid=1 new_cpuid= 2 

resourceattach

This probe is fired when a resource is attached to another resource in the system.

Syntax: @@sysproc:resourceattach:*

Special built-ins supported

__srcresourceinfo{
	type;
	subtype;
	id;		<- resource type identifier
	offset; 		<-offset if a memory resource
	length;		<- length if a memory resource
	policy;
}
__tgtresourceinfo{
	type;
	subtype;
	id;		<- resource type identifier
	offset;		<-offset if a memory resource
	length;		<- length if a memory resource
	policy;		
}

Where type and subtype can have one of the following values.

Table 50. The resourceattach probe: type and subtype values

Resource type   Description
R_NADA          Nothing - invalid specification
R_PROCESS       Process
R_RSET          Resource set
R_SUBRANGE      Memory range
R_SHM           Shared Memory
R_FILDES        File identified by an open file
R_THREAD        Thread
R_SRADID        SRAD identifier
R_PROCMEM       Process Memory

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __mst.

Execution environment

Runs in process environment.
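
As an illustration, the following minimal sketch prints the type and identifier of the source and target resources on each attach. The format specifiers are assumptions: type is assumed to be printable with %d and id with %llu, neither of which is stated in this section.

```
@@sysproc:resourceattach:*
{
	/* type holds one of the R_* resource type values; id identifies the resource */
	printf ("resourceattach: src type=%d id=%llu tgt type=%d id=%llu\n",
		__srcresourceinfo->type, __srcresourceinfo->id,
		__tgtresourceinfo->type, __tgtresourceinfo->id);
}
```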

resourcedetach

This probe is fired when a resource is detached from another resource in the system.

Syntax: @@sysproc:resourcedetach:*

Special built-ins supported

__srcresourceinfo{
	type;
	subtype;
	id;		<- resource type identifier
	offset; 		<-offset if a memory resource
	length;		<- length if a memory resource
	policy;
}

__tgtresourceinfo{
	type;
	subtype;
	id;		<- resource type identifier
	offset;		<-offset if a memory resource
	length;		<- length if a memory resource
	policy;		
}

Where type and subtype can have one of the following values.

Table 51. The resourcedetach probe: type and subtype values

Resource type   Description
R_NADA          Nothing - invalid specification
R_PROCESS       Process
R_RSET          Resource set
R_SUBRANGE      Memory range
R_SHM           Shared Memory
R_FILDES        File identified by an open file
R_THREAD        Thread
R_SRADID        SRAD identifier
R_PROCMEM       Process Memory

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __mst, __tid, __pname.

Execution environment

Runs in process environment.

drphasestart

This probe is fired when a DR (dynamic reconfiguration) handler is about to be called.

Syntax: @@sysproc:drphasestart:*

Special built-ins supported


__drphaseinfo{
	dr_operation;   ← dr operation
	dr_flags;	
	dr_phase;
	handler_rc;    ← always 0 in drphasestart
}

dr_operation can have one of the following values:

  • DR_RM_MEM_OPER
  • DR_ADD_MEM_OPER
  • DR_RM_CPU_OPER
  • DR_ADD_CPU_OPER
  • DR_CPU_SPARE_OPER
  • DR_RM_CAP_OPER
  • DR_ADD_CAP_OPER
  • DR_RM_RESMEM_OPER
  • DR_PMIG_OPER
  • DR_WMIG_OPER
  • DR_WMIG_CHECKPOINT_OPER
  • DR_WMIG_RESTART_OPER
  • DR_SOFT_RES_CHANGES_OPER
  • DR_ADD_MEM_CAP_OPER
  • DR_RM_MEM_CAP_OPER
  • DR_CPU_AFFINITY_REFRESH_OPER
  • DR_AME_FACTOR_OPER
  • DR_PHIB_OPER
  • DR_ACC_OPER
  • DR_CHLMB_OPER
  • DR_ADD_RESMEM_OPER

dr_flags can be a combination of the following values:

  • DRP_FORCE
  • DRP_RPDP
  • DRP_DOIT_SUCCESS
  • DRP_PRE_REGISTERED
  • DRP_CPU
  • DRP_MEM
  • DRP_SPARE
  • DRP_ENT_CAP
  • DRP_VAR_WGT
  • DRP_RESERVE
  • DRP_PMIG
  • DRP_WMIG
  • DRP_WMIG_CHECKPOINT
  • DRP_WMIG_RESTART
  • DRP_SOFT_RES_CHANGES
  • DRP_MEM_ENT_CAP
  • DRP_MEM_VAR_WGT
  • DRP_CPU_AFFINITY_REFRESH
  • DRP_AME_FACTOR
  • DRP_PHIB
  • DRP_ACC_UPDATE
  • DRP_CHLMB

Other supported built-ins

__errno, __kernelmode, __arg1 to __arg7, __tid.

Execution environment

Runs in process or interrupt environment.

Example