Running ProbeVue
Dynamic tracing is only allowed for users with privileges or for the superuser.
Authorizations and privileges
This is unlike the static tracing facilities in AIX, which enforce relatively limited privilege checking. The probevue command requires privileges because a Vue script can affect system performance more severely than a static tracing facility such as AIX system trace. Probe points for system trace are predefined and restricted, whereas ProbeVue can potentially support many more probe points, and the probe locations can be defined almost anywhere. Further, ProbeVue trace actions at a probe point can take much longer to run than system trace actions, which are limited to explicit data capture.
In addition, ProbeVue allows you to trace processes and read kernel global variables, both of which need to be controlled to prevent security exposures. A ProbeVue session can also consume a lot of pinned memory and restricting usage of ProbeVue to users with privilege reduces the risk of denial of service attacks. ProbeVue also allows administrators to control the memory usage of ProbeVue sessions through the SMIT interface.
Privileges for dynamic tracing are obtained differently depending upon whether role-based access control (RBAC) is enabled or not. Please refer to the AIX® man pages for more information about enabling and disabling RBAC.
Note that in legacy or RBAC-disabled mode, there are no authorizations. Regular users cannot acquire privileges to run the probevue command to start a dynamic tracing session or run the probevctrl command to administer ProbeVue. Only the superuser can have privileges for both these functions. Do not disable RBAC when using ProbeVue unless you prefer to restrict this facility to root users only.
RBAC-enabled mode
Privileges in an RBAC system are obtained through authorizations. An authorization is a text string associated with security-related functions or commands. Authorizations provide the mechanism to grant rights to you to perform privileged actions. Only a user with sufficient authorization can issue the probevue command and start a dynamic tracing session.
- aix.ras.probevue.trace.user.self
- This authorization allows you to trace your applications in user space. The user ID of the process to be traced must be equal to the real user ID of the user invoking the probevue command. This authorization allows you to enable probe points provided by the uft probe manager for your processes. However, the effective, real, and saved user IDs of the process to be traced must be equal. Thus, you cannot trace setuid programs with just this authorization.
- aix.ras.probevue.trace.user
- This authorization allows you to trace any application in user space including setuid programs and applications started by the superuser. Be careful when handing out this authorization. This authorization allows you to issue the probevue command and enable probe points provided by the uft probe manager for any process on the system.
- aix.ras.probevue.trace.syscall.self
- This authorization allows you to trace system calls made by your applications. The effective, real, and saved user IDs of the process making the system call must be the same and equal to the real user ID of the user invoking the probevue command. This authorization allows you to enable probe points provided by the syscall probe manager for your processes. The second field of the probe specification must indicate the process ID for a process started by you.
- aix.ras.probevue.trace.syscall
- This authorization allows you to trace system calls made by any application on the system including setuid programs and applications started by the superuser. Be careful when handing out this authorization. This authorization allows you to issue the probevue command and enable probe points provided by the syscall probe manager for any process. The second field of the probe specification can either be set to a process ID to probe a specific process or to * to probe all processes.
- aix.ras.probevue.trace
- This authorization allows you to trace the entire system and includes all the authorizations defined in the preceding sections. You can also access and read kernel variables when running the probevue command, trace system trace events by using the systrace probe manager and trace the CPU bound probes by using the interval probe manager. Be careful while using this authorization.
- aix.ras.probevue.manage
- This authorization allows you to administer ProbeVue. This includes changing the values of the different ProbeVue parameters, starting or stopping ProbeVue and viewing details of dynamic tracing sessions of all users when running the probevctrl command. Without this authorization, you can use the probevctrl command to view session data for dynamic tracing sessions started by you or view the current values for ProbeVue parameters.
- aix.ras.probevue.rase
- This authorization allows you access to a highly privileged set of "RAS events" Vue functions, which can produce system and LMT trace records, create live dumps, and even cause a system abend. This privilege must be very carefully controlled.
- aix.ras.probevue
- This authorization grants all dynamic tracing privileges and is equivalent to all the preceding authorizations combined.
The superuser (or root) has all these authorizations assigned by default. Other users will need to have authorizations assigned to them by first creating a role with a set of authorizations and assigning the role to the user. The user will also need to switch roles to a role that has the required authorizations defined for dynamic tracing before invoking the probevue command. The following script is an example of how to provide user "joe" authorization to enable user space and system call probes for processes started by "joe".
mkrole authorizations="aix.ras.probevue.trace.user.self,aix.ras.probevue.trace.syscall.self" apptrace
chuser roles=apptrace joe
setkst -t roleTR
User "joe" then switches to the new role with the following command:
swrole apptrace
ProbeVue privileges
The privileges that are available for ProbeVue are listed in the following table. A description of each privilege and the authorizations that map to that privilege is provided. Privileges form a hierarchy where the parent privilege contains all of the rights that are associated with the privileges of its children, but it can include additional privileges also.
Privilege | Description | Authorizations | Associated command |
---|---|---|---|
PV_PROBEVUE_TRC_USER_SELF | Allows a process to enable dynamic user space probe points on another process with the same real user ID. | aix.ras.probevue.trace.user.self, aix.ras.probevue.trace.user, aix.ras.probevue.trace, aix.ras.probevue | probevue |
PV_PROBEVUE_TRC_USER | Allows a process to enable dynamic user space probe points. Includes the PV_PROBEVUE_TRC_USER_SELF privilege. | aix.ras.probevue.trace.user, aix.ras.probevue.trace, aix.ras.probevue | probevue |
PV_PROBEVUE_TRC_SYSCALL_SELF | Allows a process to enable dynamic system call probe points on another process with the same real user ID. | aix.ras.probevue.trace.syscall.self, aix.ras.probevue.trace.syscall, aix.ras.probevue.trace, aix.ras.probevue | probevue |
PV_PROBEVUE_TRC_SYSCALL | Allows a process to enable dynamic system call space probe points. Includes the PV_PROBEVUE_TRC_SYSCALL_SELF privilege. | aix.ras.probevue.trace.syscall, aix.ras.probevue.trace, aix.ras.probevue | probevue |
PV_PROBEVUE_TRC_KERNEL | Allows a process to access kernel data when dynamic tracing. | aix.ras.probevue.trace, aix.ras.probevue | probevue |
PV_PROBEVUE_MANAGE | Allows a process to administer ProbeVue. | aix.ras.probevue.manage, aix.ras.probevue | probevctrl |
PV_PROBEVUE_RASE | Authorizes the use of the restricted "RAS events" functions. | aix.ras.probevue.rase, aix.ras.probevue | probevue |
PV_PROBEVUE_ | Equivalent to all the preceding privileges (PV_PROBEVUE_*) combined. | aix.ras.probevue | probevue probevctrl |
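The table can be read as a lookup from the authorizations a role holds to the privileges it obtains. The following Python sketch only restates the table rows; the dictionary and the helper name `privileges_for` are illustrative assumptions and are not part of AIX or of any ProbeVue interface.

```python
# Illustrative restatement of the privilege table; not an AIX API.
# Each privilege maps to the authorizations listed for it in the table.
PRIVILEGE_AUTHORIZATIONS = {
    "PV_PROBEVUE_TRC_USER_SELF": {
        "aix.ras.probevue.trace.user.self",
        "aix.ras.probevue.trace.user",
        "aix.ras.probevue.trace",
        "aix.ras.probevue",
    },
    "PV_PROBEVUE_TRC_USER": {
        "aix.ras.probevue.trace.user",
        "aix.ras.probevue.trace",
        "aix.ras.probevue",
    },
    "PV_PROBEVUE_TRC_SYSCALL_SELF": {
        "aix.ras.probevue.trace.syscall.self",
        "aix.ras.probevue.trace.syscall",
        "aix.ras.probevue.trace",
        "aix.ras.probevue",
    },
    "PV_PROBEVUE_TRC_SYSCALL": {
        "aix.ras.probevue.trace.syscall",
        "aix.ras.probevue.trace",
        "aix.ras.probevue",
    },
    "PV_PROBEVUE_TRC_KERNEL": {"aix.ras.probevue.trace", "aix.ras.probevue"},
    "PV_PROBEVUE_MANAGE": {"aix.ras.probevue.manage", "aix.ras.probevue"},
    "PV_PROBEVUE_RASE": {"aix.ras.probevue.rase", "aix.ras.probevue"},
}


def privileges_for(authorizations):
    """Return the sorted privileges whose table row lists any held authorization."""
    held = set(authorizations)
    return sorted(
        privilege
        for privilege, auths in PRIVILEGE_AUTHORIZATIONS.items()
        if auths & held
    )


# The two authorizations given to the "apptrace" role in the earlier example.
print(privileges_for([
    "aix.ras.probevue.trace.user.self",
    "aix.ras.probevue.trace.syscall.self",
]))
```

Under this reading, the apptrace role obtains only the two "self" privileges, while a role holding aix.ras.probevue obtains every row in the table.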
ProbeVue parameters
All ProbeVue parameters can be modified through the SMIT interface (use the "smit probevue" fast path) or directly through the probevctrl command. ProbeVue can be stopped if there are no active dynamic tracing sessions and it can be restarted after stopping it without requiring a reboot. ProbeVue can fail to stop if any sessions that used thread-local variables had been previously active.
The following table summarizes the parameters defined for dynamic tracing sessions. In the description, a privileged user refers to the superuser or a user with the aix.ras.probevue.trace authorization and a non-privileged user is one who does not have this authorization.
Description as in SMIT | Maximum value | Initial high configuration value | Initial low configuration value | Minimum value | Description |
---|---|---|---|---|---|
MAX pinned memory for ProbeVue framework | 64 GB | 10% of available memory or the maximum value, whichever is smaller. | 16 MB | 3 MB | Maximum pinned memory, in MB, that is allocated for ProbeVue data structures, including per-CPU stacks and per-CPU local table regions, and by all dynamic tracing sessions. It does not include any memory allocated by probe managers. Note: Although this parameter can be modified at any time, the value takes effect only the next time ProbeVue is started. |
Default per-CPU trace buffer size | 256 MB | 128 KB | 8 KB | 4 KB | Default size, in KB, of the per-CPU trace buffer. ProbeVue allocates two trace buffers per CPU for each dynamic tracing session: one active buffer, used by the writer (the Vue program) when it captures trace data, and one inactive buffer, used by the reader (the trace consumer). For example, on an 8-way system with the per-CPU trace buffer size set to 16 KB, the total memory consumed by the trace buffers for a ProbeVue session is 256 KB. You can specify a different buffer size (larger or smaller) when you start the probevue command, as long as it is within the session memory limits. |
MAX pinned memory for regular user sessions | 64 GB | 2 MB | 2 MB | 0 MB | Maximum pinned memory allocated for a non-privileged user ProbeVue session including memory for the per-CPU trace buffers. A value of 0 effectively disables all non-privileged users. Privileged users have no limits on the memory used by their ProbeVue sessions. However, they are still limited by the maximum pinned memory allowed for the ProbeVue framework. |
MIN trace buffer read rate for regular user | 5000 ms | 100 ms | 100 ms | 10 ms | The minimum period, in milliseconds, that a non-privileged user can request the trace consumer to check for trace data. This value is internally rounded to the next highest multiple of 10 milliseconds. Privileged users are not limited by this parameter, but the fastest read rate that they can specify is 10 milliseconds. |
Default trace buffer read rate | 5000 ms | 100 ms | 100 ms | 10 ms | The default period, in milliseconds, at which the trace consumer checks the in-memory trace buffers for trace data. You can specify a different read rate (larger or smaller) when starting the probevue command, as long as it is not smaller than the minimum buffer read rate. |
MAX concurrent sessions for regular user | 8 | 1 | 1 | 0 | Number of concurrent ProbeVue sessions allowed for a non-privileged user. A value of zero effectively disables all non-privileged users. |
Size of per-CPU computation stack | 256 KB | 20 KB | 12 KB | 8 KB | The size of the per-CPU computation stack used by ProbeVue when running the Vue script. The value is rounded to the next highest multiple of 8 KB. ProbeVue allocates a single stack per CPU for all ProbeVue sessions. The memory consumed for the stacks is not included in the per-session limits. Note: Although this parameter can be modified at any time, the value takes effect only after the AIX kernel boot image is rebuilt and the system is rebooted. You must configure the ProbeVue stack to use 96 KB of virtual memory to get the current directory listing. |
Size of per-CPU local table size | 256 KB | 32 KB | 4 KB | 4 KB | The size of the per-CPU local table used by ProbeVue for saving variables of automatic class and for saving temporary variables. ProbeVue uses half of this area for automatic variables and the remaining half for temporary variables. The value is always rounded to the next highest multiple of 4 KB. ProbeVue allocates a single local table and a single temporary table per CPU, used by all ProbeVue sessions. The memory consumed for the local tables is not included in the per-session limits. Note: Although this parameter can be modified at any time, the value takes effect only the next time ProbeVue is started. |
MIN interval allowed in an interval probe | N/A | 1 | 1 | Minimum timer interval, in milliseconds, allowed for global root user in interval probes. | |
Number of threads to be traced | N/A | 32 | 32 | 1 | Maximum number of threads that a ProbeVue session can support when it has thread-local variables. The ProbeVue framework allocates the thread-local variables to the maximum number of threads that are specified with this attribute, at the start of the session. If more than the specified number of threads hit the probe that has a thread-local variable, the ProbeVue session is abruptly stopped. |
Number of page faults to be handled | 1024 | 0 | 0 | 0 | Number of page fault contexts for handling page faults for the entire framework. A page fault context includes a stack and a local table for saving automatic class variables and temporary variables. A page fault context is required to access paged-out data. If no page fault context is free at the time of a page fault, ProbeVue does not fetch the paged-out data. |
Maximum probe execution time for systrace probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, a systrace probe executing in interrupt context can take. By default, the value is zero, which means the systrace probe can take any amount of time. |
Maximum probe execution time for io probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, an io probe executing in interrupt context can take. By default, the value is zero, which means the io probe can take any amount of time. |
Maximum probe execution time for sysproc probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, a sysproc probe executing in interrupt context can take. By default, the value is zero, which means the sysproc probe can take any amount of time. |
Maximum probe execution time for network probes when fired in interrupt context | N/A | 0 | 0 | 0 | This number limits the maximum time, in milliseconds, a network probe executing in interrupt context can take. By default, the value is zero, which means the network probe can take any amount of time. |
Max network buffer size | 64 KB | 64 bytes | 96 bytes | 96 bytes | The size, in bytes, of a pre-allocated buffer used by the network probe manager for bpf probe points. The buffer is allocated when the first bpf probe is enabled and is released when the last bpf probe is disabled. It is used to copy the data when packet data spans multiple packet buffers. |
Asynchronous statistics fetch interval, in milliseconds | NA | 1000 milliseconds (1 second) | 1000 milliseconds (1 second) | 100 milliseconds | The interval, in milliseconds, to fetch the asynchronous statistics. This value is global and is applicable to all ProbeVue sessions. |
Fetch statistics in asynchronous mode only | NA | No | No | NA | Specifies that ProbeVue must fetch statistics in asynchronous mode even if synchronous mode is available. |
Maximum probe execution time for CPU bound interval probes when fired in interrupt context. | 60 seconds | 60 seconds | 100 milliseconds | 100 milliseconds | This number limits the maximum time, in milliseconds, a CPU bound interval probe executing in interrupt context can take. By default, the value is 60 seconds. |
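The sizing rules in the table reduce to simple arithmetic. The following Python sketch works through the trace buffer example (two buffers per CPU) and the "rounded to the next highest multiple" rules for the read rate, computation stack, and local table. The function names are hypothetical, and the sketch assumes that a value that is already an exact multiple is left unchanged, which the table does not state explicitly.

```python
def round_up(value, multiple):
    """Round value up to a multiple of `multiple`.

    Assumption: a value that is already an exact multiple is unchanged.
    """
    return -(-value // multiple) * multiple


def trace_buffer_total_kb(ncpus, per_cpu_buffer_kb):
    """Total trace buffer memory for one session.

    Two buffers (one active, one inactive) are allocated per CPU.
    """
    return ncpus * 2 * per_cpu_buffer_kb


# The 8-way example from the table: 8 CPUs x 2 buffers x 16 KB.
print(trace_buffer_total_kb(8, 16))  # 256

# Hypothetical requested values, rounded as the table describes.
print(round_up(25, 10))  # read rate in ms, multiple of 10 ms -> 30
print(round_up(20, 8))   # computation stack in KB, multiple of 8 KB -> 24
print(round_up(6, 4))    # local table in KB, multiple of 4 KB -> 8
```

The session memory limit for a regular user also covers memory other than the trace buffers, so the first figure is a lower bound on what a session consumes, not the full amount.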
Profiling ProbeVue Session
The ProbeVue framework provides a profiling facility that can be turned on or off to estimate the impact of enabled probes on the application. This facility accumulates the time taken by probe actions when they are started and reports when requested or when the session ends.
The profiling report displays the probe string and the time taken by the action corresponding to that probe string. The time that is consumed by the probe action is maintained as a list where the data collected is the total, minimum, maximum, and average time taken by the probe action. Profiling data also displays the number of times that the probe action was timed. When you are looking up the profile for multiple functions through one probe string (by using a regular expression or * in place of the function name), profiling data provides accumulated data for the probes started for all such functions. It does not provide timing details for each probed function separately, only for each probe action.
The BEGIN and END probe actions are not profiled with this facility. These profiling details are session-specific details. You can enable probevue session profiling along with session start by using the probevue command or probevctrl command.
For more information, see the probevue and probevctrl commands.
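The per-probe figures in the profiling report (total, minimum, maximum, average, and count) can be pictured as a small accumulator keyed by probe string. The following Python sketch is only a model of that bookkeeping, not ProbeVue's implementation; the class and function names, the probe string, and the timing values and their unit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProbeProfile:
    """Accumulated timing data for one probe string (illustrative model)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = 0.0

    def record(self, elapsed):
        """Fold one timed probe action into the running statistics."""
        self.count += 1
        self.total += elapsed
        self.minimum = min(self.minimum, elapsed)
        self.maximum = max(self.maximum, elapsed)

    @property
    def average(self):
        return self.total / self.count if self.count else 0.0


profiles = {}


def record_action(probe_string, elapsed):
    """Accumulate per probe string, so a wildcard probe string
    collects the timings of every function it matches in one entry."""
    profiles.setdefault(probe_string, ProbeProfile()).record(elapsed)


# Hypothetical timings for three firings of the same probe action.
for elapsed in (4.0, 10.0, 7.0):
    record_action("@@syscall:*:read:entry", elapsed)

p = profiles["@@syscall:*:read:entry"]
print(p.count, p.total, p.minimum, p.maximum, p.average)
```

Keying the data by probe string rather than by function is what produces the accumulated figures described above when one probe string matches several functions.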
Sample programs
Example 1
The following canonical "Hello World" program prints "Hello World" into the trace buffer and exits:
#!/usr/bin/probevue
/* Hello World in probevue */
/* Program name: hello.e */
@@BEGIN
{
printf("Hello World\n");
exit();
}
Example 2
The following "Hello World" program prints "Hello World" when you type Ctrl-C on the keyboard:
#!/usr/bin/probevue
/* Hello World 2 in probevue */
/* Program name: hello2.e */
@@END
{
printf("Hello World\n");
}
Example 3
The following program shows how to use thread-local variables. This Vue script counts the number of bytes written to a particular file. It assumes that the processes are single-threaded or those threads that open files are the same ones that write to them. It also assumes that all write operations are successful. The script can be terminated at any time and you can obtain the current count of bytes written by typing Ctrl-C on the terminal.
#!/usr/bin/probevue
/* Program name: countbytes.e */
int open( char * Path, int OFlag, int mode );
int write( int fd, char * buf, int sz);
int done;
@@syscall:*:open:entry
when (done == 0)  /* done starts at 0; stop matching after /tmp/foo is first opened */
{
if (get_userstring(__arg1, -1) == "/tmp/foo") {
thread:trace = 1;
done = 1;
}
}
@@syscall:*:open:exit
when (thread:trace)
{
thread:fd = __rv;
}
@@syscall:*:write:entry
when (thread:trace && __arg1 == thread:fd)
{
bytes += __arg3; /* number of bytes is third arg */
}
@@END
{
printf("Bytes written = %d\n", bytes);
}
Example 4
The following tentative tracing program shows how to trace the arguments passed to the read system call only if the read fails when reading the foo.data file:
#!/usr/bin/probevue
/* File: ttrace.e */
/* Example of tentative tracing */
/* Capture parameters to read system call only if read fails */
int open ( char* Path, int OFlag , int mode );
int read ( int fd, char * buf, int sz);
@@syscall:*:open:entry
{
filename = get_userstring(__arg1, -1);
if (filename == "foo.data") {
thread:open = 1;
start_tentative("read");
printf("File foo.data opened\n");
}
}
@@syscall:*:open:exit
when (thread:open == 1)
{
thread:fd = __rv;
start_tentative("read");
printf("fd = %d\n", thread:fd);
thread:open = 0;
}
@@syscall:*:read:entry
when (__arg1 == thread:fd)
{
start_tentative("read");
printf("Read fd = %d, input buffer = 0x%08x, bytes = %d,",
__arg1, __arg2, __arg3);
end_tentative("read");
thread:read = 1;
}
@@syscall:*:read:exit
when (thread:read == 1)
{
if (__rv < 0) {
/* The printf below, even though non-tentative, is only
* executed in error cases and merges with the
* previously printed tentative data
*/
printf(" errno = %d\n", __errno);
commit_tentative("read");
}
else
discard_tentative("read");
thread:read = 0;
}
A possible output if the read failed because a bad address (say 0x1000) was passed as input buffer pointer could look like the following output:
# probevue ttrace.e
File foo.data opened
fd = 4
Read fd = 4, input buffer = 0x00001000, bytes = 256, errno = 14
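The commit and discard behavior in this example can be modeled in a few lines. The following Python sketch is a toy model, not ProbeVue's implementation: the class and method names are hypothetical, and it assumes that committed tentative data appears in capture order alongside regular trace data, which is consistent with the sample output above.

```python
class TraceModel:
    """Toy model of tentative tracing; not how ProbeVue stores trace data."""

    def __init__(self):
        # Each record is [text, tentative buffer ID or None], in capture order.
        self.records = []

    def trace(self, text, tentative_id=None):
        """Capture text, either normally or tentatively under a buffer ID."""
        self.records.append([text, tentative_id])

    def commit_tentative(self, buf_id):
        """Make all tentative data for buf_id part of the regular trace."""
        for record in self.records:
            if record[1] == buf_id:
                record[1] = None

    def discard_tentative(self, buf_id):
        """Drop all tentative data captured under buf_id."""
        self.records = [r for r in self.records if r[1] != buf_id]

    def output(self):
        """Return only the data that is not (or no longer) tentative."""
        return "".join(text for text, tid in self.records if tid is None)


t = TraceModel()
t.trace("File foo.data opened\n", "read")
t.trace("fd = 4\n", "read")
t.trace("Read fd = 4, input buffer = 0x00001000, bytes = 256,", "read")
t.trace(" errno = 14\n")        # regular printf in the failing read:exit
t.commit_tentative("read")      # the read failed, so keep the tentative data
print(t.output(), end="")
```

Had the read succeeded, the script would call discard_tentative instead, and in this model none of the three tentative records would reach the output.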
Example 5
The following Vue script prints the values of some kernel variables and exits immediately. Pay attention to the exit function in the @@BEGIN probe:
/* File: kernel.e */
/* Example of accessing kernel variables */
/* System configuration structure from /usr/include/sys/systemcfg.h */
struct system_configuration {
int architecture; /* processor architecture */
int implementation; /* processor implementation */
int version; /* processor version */
int width; /* width (32 || 64) */
int ncpus; /* 1 = UP, n = n-way MP */
int cache_attrib; /* L1 cache attributes (bit flags) */
/* bit 0/1 meaning */
/* -------------------------------------*/
/* 31 no cache / cache present */
/* 30 separate I and D / combined */
int icache_size; /* size of L1 instruction cache */
int dcache_size; /* size of L1 data cache */
int icache_asc; /* L1 instruction cache associativity */
int dcache_asc; /* L1 data cache associativity */
int icache_block; /* L1 instruction cache block size */
int dcache_block; /* L1 data cache block size */
int icache_line; /* L1 instruction cache line size */
int dcache_line; /* L1 data cache line size */
int L2_cache_size; /* size of L2 cache, 0 = No L2 cache */
int L2_cache_asc; /* L2 cache associativity */
int tlb_attrib; /* TLB attributes (bit flags) */
/* bit 0/1 meaning */
/* -------------------------------------*/
/* 31 no TLB / TLB present */
/* 30 separate I and D / combined */
int itlb_size; /* entries in instruction TLB */
int dtlb_size; /* entries in data TLB */
int itlb_asc; /* instruction tlb associativity */
int dtlb_asc; /* data tlb associativity */
int resv_size; /* size of reservation */
int priv_lck_cnt; /* spin lock count in supervisor mode */
int prob_lck_cnt; /* spin lock count in problem state */
int rtc_type; /* RTC type */
int virt_alias; /* 1 if hardware aliasing is supported */
int cach_cong; /* number of page bits for cache synonym */
int model_arch; /* used by system for model determination */
int model_impl; /* used by system for model determination */
int Xint; /* used by system for time base conversion */
int Xfrac; /* used by system for time base conversion */
int kernel; /* kernel attributes */
/* bit 0/1 meaning */
/* -----------------------------------------*/
/* 31 32-bit kernel / 64-bit kernel */
/* 30 non-LPAR / LPAR */
/* 29 old 64bit ABI / 64bit Large ABI */
/* 28 non-NUMA / NUMA */
/* 27 UP / MP */
/* 26 no DR CPU add / DR CPU add support */
/* 25 no DR CPU rm / DR CPU rm support */
/* 24 no DR MEM add / DR MEM add support */
/* 23 no DR MEM rm / DR MEM rm support */
/* 22 kernel keys disabled / enabled */
/* 21 no recovery / recovery enabled */
/* 20 non-MLS / MLS enabled */
long long physmem; /* bytes of OS available memory */
int slb_attr; /* SLB attributes */
/* bit 0/1 meaning */
/* -----------------------------------------*/
/* 31 Software Managed */
int slb_size; /* size of slb (0 = no slb) */
int original_ncpus; /* original number of CPUs */
int max_ncpus; /* max cpus supported by this AIX image */
long long maxrealaddr; /* max supported real memory address +1 */
long long original_entitled_capacity;
/* configured entitled processor capacity */
/* at boot required by cross-partition LPAR */
/* tools. */
long long entitled_capacity; /* entitled processor capacity */
long long dispatch_wheel; /* Dispatch wheel time period (TB units) */
int capacity_increment; /* delta by which capacity can change */
int variable_capacity_weight; /* priority weight for idle capacity*/
/* distribution */
int splpar_status; /* State of SPLPAR enablement */
/* 0x1 => 1=SPLPAR capable; 0=not */
/* 0x2 => SPLPAR enabled 0=dedicated; */
/* 1=shared */
int smt_status; /* State of SMT enablement */
/* 0x1 = SMT Capable 0=no/1=yes */
/* 0x2 = SMT Enabled 0=no/1=yes */
/* 0x4 = SMT threads bound true 0=no/1=yes */
int smt_threads; /* Number of SMT Threads per Physical CPU */
int vmx_version; /* RPA defined VMX version, 0=none/disabled */
long long sys_lmbsize; /* Size of an LMB on this system. */
int num_xcpus; /* Number of exclusive cpus on line */
signed char errchecklevel;/* Kernel error checking level */
char pad[3]; /* pad to word boundary */
int dfp_version; /* RPA defined DFP version, 0=none/disabled */
/* if MSbit is set, DFP is emulated */
};
__kernel struct system_configuration _system_configuration;
@@BEGIN
{
String s[40];
int j;
__kernel int max_sdl; /* Atomic RAD system decomposition level */
__kernel long lbolt; /* Ticks since boot */
printf("No. of online CPUs\t\t= %d\n", _system_configuration.ncpus);
/* Print SMT status */
printf("SMT status\t\t\t=");
if (_system_configuration.smt_status == 0)
printf(" None");
else {
if (_system_configuration.smt_status & 0x01)
printf(" Capable");
if (_system_configuration.smt_status & 0x02)
printf(" Enabled");
if (_system_configuration.smt_status & 0x04)
printf(" BoundThreads");
}
printf("\n");
/* Print error checking level */
if (_system_configuration.errchecklevel == 1)
s = "Minimal";
else if (_system_configuration.errchecklevel == 3)
s = "Normal";
else if (_system_configuration.errchecklevel == 7)
s = "Detail";
else if (_system_configuration.errchecklevel == 9)
s = "Maximal";
printf("Error checking level\t\t= %s\n",s);
printf("Atomic RAD system detail level\t= %d\n", max_sdl);
/* Long in the kernel is 64-bit, so we use %lld below */
printf("Number of ticks since boot\t= %lld\n", lbolt);
exit();
}
The following is a possible output when you run the preceding script on a Power 5 dedicated partition with default kernel attributes:
# probevue kernel.e
No. of online CPUs = 4
SMT status = Capable Enabled BoundThreads
Error checking level = Normal
Atomic RAD system detail level = 2
Number of ticks since boot = 34855934
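The SMT status line is produced by testing individual bits of the smt_status field. The following Python sketch mirrors that decoding outside of Vue, using the bit meanings from the structure comments (0x1 capable, 0x2 enabled, 0x4 bound threads); the function name is hypothetical.

```python
# Bit meanings taken from the smt_status comments in the structure above.
SMT_FLAGS = (
    (0x01, "Capable"),
    (0x02, "Enabled"),
    (0x04, "BoundThreads"),
)


def describe_smt_status(smt_status):
    """Return the labels the Vue script prints for an smt_status value."""
    if smt_status == 0:
        return "None"
    return " ".join(name for bit, name in SMT_FLAGS if smt_status & bit)


print(describe_smt_status(0x07))  # Capable Enabled BoundThreads
print(describe_smt_status(0x01))  # Capable
print(describe_smt_status(0x00))  # None
```

A value of 0x07 corresponds to the sample output shown above, where all three flags are set.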