splat Command

Purpose

Simple Performance Lock Analysis Tool (splat). Provides kernel and pthread lock usage reports.

Syntax

splat -i file [ -n file ] [ -o file ] [ -d [ bfta ]] [ -l address ] [ -c class] [ -s [ acelmsS ]] [ -C cpus ] [ -S count ] [ -t start] [ -T stop] [ -p ]

splat -h [ topic ]

splat -j

Description

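splat (Simple Performance Lock Analysis Tool) is a software tool that post-processes AIX® trace files to produce reports on the usage of kernel simple and complex locks. It also produces usage reports for pthread mutexes, read/write locks, and condition variables.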

Flags

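Flag            Description
-i inputfile    AIX trace file (required).
-n namefile     File containing the output of the gensyms command.
-o outputfile   File to which reports are written (default: stdout).
-d detail       Level of detail, one of:
                  b  basic: summary and lock detail (default)
                  f  function: basic plus function detail
                  t  thread: basic plus thread detail
                  a  all: basic plus function and thread detail
-c class        If the user supplies a decimal lock class index, splat reports
                only the activity of locks in that class.
-l address      If the user supplies a hexadecimal lock address, splat reports
                only the activity of the lock at that address. splat filters
                the trace file for lock hooks that contain that lock address
                and generates a report for that lock alone.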
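-s criteria     Sorts the lock, function, and thread reports by one of the
                following criteria:
                  a  Acquisitions
                  c  Percent processor hold time
                  e  Percent elapsed hold time
                  l  Lock address, function address, or thread ID
                  m  Miss rate
                  s  Spin count
                  S  Percent processor spin hold time (default)
                  w  Percent real wait time
                  W  Average waitq depth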
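-C cpus         Specifies the number of processors present for this trace.
-S count        Maximum number of entries in each report (default: 10).
-t starttime    Time offset, in seconds, from the beginning of the trace at
                which analysis starts.
-T stoptime     Time offset, in seconds, from the beginning of the trace at
                which analysis of the trace data stops (default: the end of
                the trace).
-h [ topic ]    Help on usage or on a specific topic. Valid topics are:
                  • all
                  • overview
                  • input
                  • names
                  • reports
                  • sorting
-j              Prints the list of trace hooks used by splat.
-p              Specifies the use of the PURR register to calculate processor
                times.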
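
To illustrate how the -t and -T offsets bound the analysis, here is a minimal sketch that filters events by their offset in seconds from the start of the trace. The event tuples, the function name, and the inclusive bounds are assumptions made for illustration; this is not splat's actual implementation.

```python
from typing import List, Optional, Tuple

# Hypothetical event record: (offset in seconds from trace start, hook ID).
Event = Tuple[float, str]


def select_window(events: List[Event], start: float = 0.0,
                  stop: Optional[float] = None) -> List[Event]:
    """Keep events whose offset lies in [start, stop].

    stop=None stands for "until the end of the trace", mirroring the
    documented default of -T.
    """
    return [e for e in events
            if e[0] >= start and (stop is None or e[0] <= stop)]


events = [(0.5, "112"), (1.2, "113"), (2.0, "112"), (3.7, "113")]
print(select_window(events, start=1.0, stop=3.0))
```

With no bounds given, the sketch keeps every event, which corresponds to analyzing the whole trace.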

Help

The following is a list of the available help topics and a brief summary of each:

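Topic       Description
OVERVIEW    This text.
INPUT       The AIX trace hooks that are required to get useful output from splat.
NAMES       The name utilities that can be used so that splat maps addresses to human-readable symbols.
REPORTS     A description of each report that splat can produce and the formulas used to compute the reported values.
SORTING     All of the available sorting options and how they apply to the output of splat.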

Splat Trace

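Splat takes as its primary input an AIX trace file that was collected with the AIX trace command. Before analyzing a trace with splat, make sure that the trace was collected with an adequate set of hooks, including the following: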
106 DISPATCH
10C DISPATCH IDLE PROCESS
10E RELOCK
112 LOCK
113 UNLOCK
134 HKWD_SYSC_EXECVE
139 HKWD_SYSC_FORK
419 CPU PREEMPT
465 HKWD_SYSC_CRTHREAD
46D WAIT LOCK
46E WAKEUP LOCK
606 HKWD_PTHREAD_COND
607 HKWD_PTHREAD_MUTEX
608 HKWD_PTHREAD_RWLOCK
609 HKWD_PTHREAD_GENERAL
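
When a trace is collected, hook IDs are usually supplied as one comma-separated list (for example, to the -j flag of the trace command; check the trace documentation for your AIX level). The sketch below only builds that string from the IDs listed above. The variable and function names are hypothetical.

```python
# Hook IDs copied from the list above.
SPLAT_HOOKS = ["106", "10C", "10E", "112", "113", "134", "139", "419",
               "465", "46D", "46E", "606", "607", "608", "609"]


def hook_list(hooks=SPLAT_HOOKS) -> str:
    """Join hook IDs into a single comma-separated string."""
    return ",".join(hooks)


print(hook_list())
```
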
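Because locks are used so frequently in a multiprocessor environment, capturing the lock and unlock trace events can cause serious performance degradation. For this reason, lock trace event reporting is normally disabled. To enable it, take the following steps (KornShell syntax) before collecting a trace that includes the lock trace events that splat requires: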
  1. bosboot -ad /dev/hdisk0 -L

  2. shutdown -Fr

  3. (reboot the machine)

  4. locktrace -S

  5. mkdir temp.lib; cd temp.lib

  6. ln -s /usr/ccs/lib/perf/libpthreads.a

  7. export LIBPATH=$PWD:$LIBPATH
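Steps 1 through 3 are optional. They enable the display of kernel lock class names instead of addresses. See bosboot(1) for more information about bosboot and its flags. Steps 5 through 7 are necessary to activate user pthread lock instrumentation; the temp.lib subdirectory can be placed anywhere. All seven steps are required for a complete report.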

Splat Names

Splat can take the output of gensyms as an optional input and use it to map lock and function addresses to human-readable symbols.

Lock class offsets can be used to identify a lock broadly, but they do not identify it as specifically as an actual symbol does.
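
As a rough illustration of address-to-symbol mapping, the following sketch looks up the nearest symbol at or below an address in a sorted table. The table contents and the function name are hypothetical, and the sketch does not parse real gensyms output, whose format is not described here.

```python
import bisect
from typing import List, Tuple

# Hypothetical symbol table: (start address, symbol name), sorted by address.
SYMBOLS: List[Tuple[int, str]] = [
    (0x1000, "lock_a"),
    (0x2000, "func_b"),
    (0x3000, "func_c"),
]


def symbolize(addr: int, table: List[Tuple[int, str]] = SYMBOLS) -> str:
    """Return 'name+0xoffset' for the nearest symbol at or below addr,
    or the bare hexadecimal address when no symbol precedes it."""
    starts = [start for start, _ in table]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return hex(addr)
    start, name = table[i]
    return f"{name}+{hex(addr - start)}"


print(symbolize(0x2010))
```

The sketch has no notion of symbol size, so any address above the last entry resolves to that entry; a real tool would also need an upper bound per symbol.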

Splat Reports

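The report generated by splat consists of a report summary, a lock summary report section, and a list of lock detail reports, each of which may have an associated function detail report, thread detail report, or both.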
Report Summary
^^^^^^^^^^^^^^
 The report summary consists of the following elements:

 - The trace command used to collect the trace.
 - The host that the trace was taken on.
 - The date that the trace was taken on.
 - The duration of the trace in seconds.
 - The estimated number of CPUs.
 - The combined elapsed duration of the trace in seconds;
   ( the duration of the trace multiplied by the number of
     CPUs identified during the trace ).
 - Start time, which is the offset in seconds from the beginning of the
   trace that trace statistics begin to be gathered.
 - Stop time, which is the offset in seconds from the beginning of the
   trace that trace statistics stop being gathered.
 - Total number of acquisitions during the trace.
 - Acquisitions per second, which is computed by dividing
   the total number of lock acquisitions by the real-time
   duration of the trace.
 - % of Total Spin Time, which is the sum of all lock spin hold times
   divided by the combined trace duration in seconds, multiplied by 100.
   The current goal is to keep this value below 10% of the total
   trace duration.
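
The derived values in the report summary can be sketched as follows. The function and key names are hypothetical, the input numbers are invented, and the spin figure is treated as a percentage (ratio multiplied by 100), which is an assumption about the intended formula.

```python
def report_summary(duration_s: float, cpus: int,
                   acquisitions: int, spin_hold_s: float) -> dict:
    """Compute the derived report-summary values from raw totals."""
    # Combined elapsed duration: trace duration times the number of CPUs.
    combined = duration_s * cpus
    return {
        "combined_elapsed_s": combined,
        # Acquisitions per second use the real-time duration, not the combined one.
        "acquisitions_per_s": acquisitions / duration_s,
        # Spin time as a percentage of the combined duration (assumed x100).
        "pct_total_spin": 100.0 * spin_hold_s / combined,
    }


summary = report_summary(duration_s=10.0, cpus=4,
                         acquisitions=50000, spin_hold_s=2.0)
print(summary)
```

In this invented example the spin percentage is 5, which would be within the stated goal of less than 10%.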


 Lock Summary
 ^^^^^^^^^^^^

 The lock summary report has the following fields:

Lock             The name, lockclass or address of the lock.


Type              The type of the lock, identified by one of the following letters: 
                      Q    A RunQ lock 
                      S    A simple kernel lock 
                      D    A disabled simple kernel lock
                      C    A complex kernel lock 
                      M    A PThread mutex 
                      V    A PThread condition-variable 
                      L    A PThread read/write lock
Acquisitions     The number of successful lock attempts for this lock, minus
                  the number of times a thread was preempted while holding
                  this lock.

Spins            The number of unsuccessful lock attempts for this lock,
                  minus the number of times a thread was undispatched while
                  spinning.

Wait             The number of unsuccessful lock attempts that
 or               resulted in the attempting thread going to
Transform         sleep to wait for the lock to become available,
                  or allocating a krlock.

%Miss            Spins divided by the sum of Acquisitions and Spins,
                  multiplied by 100.

%Total           Acquisitions divided by the total number of all
                  lock acquisitions, multiplied by 100.

Locks/CSec       Acquisitions divided by the combined elapsed
                  duration in seconds.

Percent HoldTime 
Real CPU          The percent of combined elapsed trace time that
                  threads held the lock in question while dispatched.
                  DISPATCHED_HOLDTIME_IN_SECONDS divided by combined
                  trace duration, multiplied by 100.

Real Elaps(ed)    The percent of combined elapsed trace time that
                  threads held the lock while dispatched or sleeping.
                  UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided
                  by combined trace duration, multiplied by 100.

Comb Spin         The percent of combined elapsed trace time that
                  threads spun while waiting to acquire this lock.
                  SPIN_HOLDTIME_IN_SECONDS divided by combined trace
                  duration, multiplied by 100.
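
The ratio fields of the lock summary can be reproduced with a few lines of arithmetic. This is a minimal sketch with hypothetical names and invented inputs; it assumes %Miss uses the sum of Acquisitions and Spins as its denominator.

```python
def lock_summary_row(acquisitions: int, spins: int,
                     total_acquisitions: int,
                     combined_elapsed_s: float) -> dict:
    """Compute %Miss, %Total and Locks/CSec for one lock."""
    attempts = acquisitions + spins
    return {
        # Share of lock attempts that did not succeed immediately.
        "pct_miss": 100.0 * spins / attempts if attempts else 0.0,
        # This lock's share of all acquisitions in the trace.
        "pct_total": 100.0 * acquisitions / total_acquisitions,
        # Acquisitions per second of combined elapsed duration.
        "locks_per_csec": acquisitions / combined_elapsed_s,
    }


row = lock_summary_row(acquisitions=900, spins=100,
                       total_acquisitions=9000, combined_elapsed_s=40.0)
print(row)
```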

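The lock summary report defaults to a list of ten locks, sorted in descending order by percent spin hold time (the tenth field). The length of the summary report can be adjusted with the -S switch. The sort order of the summary report (and of all the other reports) can be set with the -s switch, whose options are described in the SORTING help section (splat -h sorting).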
Lock Detail
^^^^^^^^^^^

 The lock detail report consists of the following fields:

 LOCK             The address (in hexadecimal) of the lock.

 NAME             The symbol mapping for that address (if available)

 CLASS            The lockclass name (if available) and hexadecimal offset,
                  used to allocate this lock ( lock_alloc() kernel service ).

 Parent Thread    Thread ID of the parent thread. This field exists only in the
                  Mutex, Read/Write lock and Conditional Variable reports.

 creation time    Elapsed time in seconds after the first event recorded in the
                  trace, if available. This field exists only in the Mutex,
                  Read/Write lock and Conditional Variable reports.

 deletion time    Elapsed time in seconds after the first event recorded in the
                  trace, if available. This field exists only in the Mutex,
                  Read/Write lock and Conditional Variable reports.

 Pid              Process ID associated with the lock. This field exists only
                  in the Mutex, Read/Write lock and Conditional Variable
                  reports.

 Process Name     Name of the process associated with the lock. This field
                  exists only in the Mutex, Read/Write lock and Conditional
                  Variable reports.

 Call-Chain       Stack of called methods, if available. This field exists only
                  in the Mutex, Read/Write lock and Conditional Variable
                  reports.


 Acquisitions     The number of successful lock attempts for this lock.
                  This field is named Passes for the conditional variable lock report.

 Miss Rate        The number of unsuccessful lock attempts divided by
                  Acquisitions plus unsuccessful lock attempts, multiplied
                  by 100.

 Spin Count       The number of unsuccessful lock attempts.

 Wait Count       The number of unsuccessful lock attempts that resulted in
                  the attempting thread going to sleep to wait for the lock
                  to become available.

 Transform Count  The number of krlocks allocated and deallocated by the simple lock.

 Busy Count       The number of simple_lock_try() calls that returned busy.

 Seconds Held
 CPU              The total time in seconds that this lock was held by
                  dispatched threads.

 Elapsed          The total time in seconds that this lock was held by
                  both dispatched and undispatched threads.

NOTE: neither of these two values should exceed the
      total real elapsed duration of the trace.

 Percent Held
 Real CPU         The percent of combined elapsed trace time that
                  threads held the lock in question while dispatched.
                  DISPATCHED_HOLDTIME_IN_SECONDS divided by trace
                  duration, multiplied by 100.

 Real Elaps(ed)   The percent of combined elapsed trace time that
                  threads held the lock while dispatched or sleeping.
                  UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided
                  by trace duration, multiplied by 100.

 Comb Spin        The percent of combined elapsed trace time that
                  threads spun while waiting to acquire this lock.
                  SPIN_HOLDTIME_IN_SECONDS divided by trace duration,
                  multiplied by 100. 

 Wait             The percentage of combined elapsed trace time that 
                  threads unsuccessfully tried to acquire this lock.

 SpinQ            Splat keeps track of the minimum, maximum and average
                  depth of the spin queue (the threads spinning, waiting
                  for a lock to become available).

 WaitQ            As with the spin queue, splat also tracks the minimum,
                  maximum and average depth of the wait queue (the threads
                  waiting for a lock to become available).

PROD              The associated krlocks prod calls count.

CONFER SELF       The confer to self calls count for the simple lock and the 
                  associated krlocks.

CONFER TARGET     The confer to target calls count for the simple lock and the
                  associated krlocks. The w/ preemption value reports the count
                  of successful calls that resulted in a preemption.

CONFER ALL        The confer to all calls count for the simple lock and the
                  associated krlocks. The w/ preemption value reports the count
                  of successful calls that resulted in a preemption.

HANDOFF           The associated krlocks handoff calls count.

Lock Activity w/Interrupts Enabled (mSecs)

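This section of the lock detail report is a dump of the raw data that splat collects for each lock, with times expressed in milliseconds. The five states, LOCK, SPIN, WAIT, UNDISP (undispatched), and PREEMPT, are the five basic states of splat's enabled simple_lock finite state machine. The count for each state is the number of times a thread's actions resulted in a transition into that state. The durations, in milliseconds, show the minimum, maximum, and total amounts of time that a lock request spent in that state.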
   LOCK:    this state represents a thread successfully acquiring a lock.

   SPIN:    this state represents a thread unsuccessfully trying to acquire
            a lock.

   WAIT:    this state represents a spinning thread (in SPIN) going to sleep
            (voluntarily) after exceeding the thread's spin threshold.

   UNDISP:  this state represents a spinning thread (in SPIN) becoming
            undispatched (involuntarily) before exceeding the thread's
            spin threshold.

   PREEMPT: this state represents when a thread holding a lock is
            undispatched.
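
The per-state count and the minimum, maximum, and total durations reported in this section could be accumulated along the following lines. This is only a sketch under stated assumptions: the class, its field names, and the sample observations are hypothetical, and it does not model the transitions between states.

```python
from dataclasses import dataclass


@dataclass
class StateStats:
    """Running count and min/max/total duration (in ms) for one state."""
    count: int = 0
    min_ms: float = float("inf")
    max_ms: float = 0.0
    total_ms: float = 0.0

    def record(self, duration_ms: float) -> None:
        """Account for one entry into the state lasting duration_ms."""
        self.count += 1
        self.min_ms = min(self.min_ms, duration_ms)
        self.max_ms = max(self.max_ms, duration_ms)
        self.total_ms += duration_ms


# The five states of the enabled simple_lock state machine.
STATES = ("LOCK", "SPIN", "WAIT", "UNDISP", "PREEMPT")
stats = {name: StateStats() for name in STATES}

# Hypothetical (state, duration in ms) observations.
for state, ms in [("LOCK", 0.5), ("SPIN", 0.25), ("LOCK", 1.5), ("WAIT", 4.0)]:
    stats[state].record(ms)

print(stats["LOCK"])
```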


Lock Activity w/Interrupts Disabled (mSecs)
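This section of the lock detail report is a dump of the raw data that splat collects for each lock, with times expressed in milliseconds. The six states, LOCK, SPIN, LOCK with KRLOCK, KRLOCK LOCK, KRLOCK SPIN, and TRANSFORM, are the six basic states of splat's disabled simple_lock finite state machine. The count for each state is the number of times a thread's actions resulted in a transition into that state. The durations, in milliseconds, show the minimum, maximum, and total amounts of time that a lock request spent in that state.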
LOCK:        This state represents a thread successfully acquiring a lock.

SPIN:        This state represents a thread unsuccessfully trying to acquire
             a lock.

LOCK with    The thread has successfully acquired the lock, while holding 
 KRLOCK:     the associated krlock, and is currently executing.

KRLOCK LOCK: The thread has successfully acquired the associated krlock, 
             and is currently executing.

KRLOCK SPIN: The thread is executing and unsuccessfully attempting to acquire
             the associated krlock. 

TRANSFORM:   The thread has successfully allocated a krlock it associates to, 
             and is executing.


Function Detail
^^^^^^^^^^^^^^^

 The function detail report consists of the following fields:

 Function Name   The name or return address of the function which
                 used the lock.

 Acquisitions    The number of successful lock attempts for this lock.
                 For complex lock and read/write lock there is a 
                 distinction between acquisition for writing 
                 (Acquisition Write) and for reading 
                 (Acquisition Read).


 Miss Rate      The number of unsuccessful lock attempts divided by
                Acquisitions, multiplied by 100.

 Spin Count     The number of unsuccessful lock attempts.
                For complex lock and read/write lock there is a 
                distinction between spin count for writing 
                (Spin Count Write) and for reading 
                (Spin Count Read).

 Wait Count     The number of unsuccessful lock attempts that resulted in
                the attempting thread going to sleep to wait for the lock
                to become available.
                For complex lock and read/write lock there is a 
                distinction between wait count for writing 
                (Wait Count Write) and for reading 
                (Wait Count Read).

Transform Count The number of times that a simple lock has allocated a krlock,
                while the thread was trying to acquire the simple lock.

 Busy Count     The number of simple_lock_try() calls that returned busy.

 Percent Held of Total Time 
 CPU              The percent of combined elapsed trace time that
                  threads held the lock in question while dispatched.
                  DISPATCHED_HOLDTIME_IN_SECONDS divided by trace
                  duration, multiplied by 100.

 Elaps(ed)        The percent of combined elapsed trace time that
                  threads held the lock while dispatched or sleeping.
                  UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided
                  by trace duration, multiplied by 100.

 Spin             The percent of combined elapsed trace time that
                  threads spun while waiting to acquire this lock.
                  SPIN_HOLDTIME_IN_SECONDS divided by combined trace
                  duration, multiplied by 100.

 Wait             The percentage of combined elapsed trace time that 
                  threads unsuccessfully tried to acquire this lock. 

 Return Address   The calling function's return address in hexadecimal.

 Start Address    The start address of the calling function in hexadecimal.

 Offset           The offset from the function start address in hexadecimal.
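
The relationship between the last three fields is simple address arithmetic: the offset is the return address minus the start address of the calling function. A minimal sketch, with a hypothetical function name and invented addresses:

```python
def function_offset(return_addr: int, start_addr: int) -> str:
    """Return the offset of return_addr from start_addr as a hex string."""
    if return_addr < start_addr:
        raise ValueError("return address lies before the function start")
    return hex(return_addr - start_addr)


print(function_offset(0x1A2C, 0x1A00))
```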


Thread Detail
^^^^^^^^^^^^^

The thread detail report consists of the following fields:

 ThreadID        Thread identifier.

 Acquisitions    The number of successful lock attempts for this lock.

 Miss Rate       The number of unsuccessful lock attempts divided by
                 Acquisitions, multiplied by 100.

 Spin Count      The number of unsuccessful lock attempts.

 Wait Count      The number of unsuccessful lock attempts that resulted in
                 the attempting thread going to sleep to wait for the lock
                 to become available.

Transform Count The number of times that a simple lock has allocated a krlock,
                while the thread was trying to acquire the simple lock.

Busy Count      The number of simple_lock_try() calls that returned busy.

 Percent Held of Total Time 
 CPU              The percent of combined elapsed trace time that
                  threads held the lock in question while dispatched.
                  DISPATCHED_HOLDTIME_IN_SECONDS divided by trace
                  duration, multiplied by 100.

 Elaps(ed)        The percent of combined elapsed trace time that
                  threads held the lock while dispatched or sleeping.
                  UNDISPATCHED_AND_DISPATCHED_HOLDTIME_IN_SECONDS divided
                  by trace duration, multiplied by 100.

 Spin             The percent of combined elapsed trace time that
                  threads spun while waiting to acquire this lock.
                  SPIN_HOLDTIME_IN_SECONDS divided by combined trace
                  duration, multiplied by 100. 

 Wait             The percent of combined elapsed trace time that 
                  threads unsuccessfully tried to acquire this lock. 

 ProcessID        Process identifier (only for SIMPLE and COMPLEX Lock report). 

 Process Name     Name of the process (only for SIMPLE and COMPLEX Lock report).

Splat Sorting

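splat allows the user to specify, with the -s option, which criterion is used to sort the summary and lock detail reports. The default sorting criterion is percent spin hold time, which is the ratio of the time that threads spent spinning for a lock to the total duration of the trace. With -s, the sort criterion can be changed to one of the following: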
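Criterion  Description
a  Acquisitions; the number of times a thread successfully acquired the lock.
c  Percent processor hold time; the ratio of processor hold time to the total trace duration.
e  Percent elapsed hold time; the ratio of elapsed hold time to the total trace duration.
l  Location; the address of the lock or function, or the ID of the thread.
m  Miss rate; the ratio of missed lock attempts to acquisitions.
s  Spin count; the number of unsuccessful lock attempts that caused a thread to spin waiting for the lock.
S  Percent processor spin hold time (default).
w  Percent elapsed wait time; the percentage of the total time during which a nonzero number of threads waited for the lock.
W  Average waitq depth; the average number of threads waiting for the lock, equivalent to the average time each waiting thread spends in this state.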

splat sorts the lock reports in descending order by the specified criterion.
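
The combined effect of a sort criterion and an entry limit can be sketched as a descending sort followed by truncation. The record layout, key names, and sample values below are hypothetical; only the default limit of 10 and the descending order come from the text.

```python
from typing import Dict, List


def top_locks(locks: List[Dict], key: str = "pct_spin",
              count: int = 10) -> List[Dict]:
    """Sort lock records by key in descending order and keep the first count."""
    return sorted(locks, key=lambda lock: lock[key], reverse=True)[:count]


# Invented lock records for illustration.
locks = [
    {"name": "lock_a", "pct_spin": 1.5, "acquisitions": 300},
    {"name": "lock_b", "pct_spin": 7.25, "acquisitions": 120},
    {"name": "lock_c", "pct_spin": 0.5, "acquisitions": 900},
]

print([lock["name"] for lock in top_locks(locks, count=2)])
print([lock["name"] for lock in top_locks(locks, key="acquisitions")])
```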

Limitations

Other types of locks, such as VMM, XMAP, and certain Java-specific locks, are not analyzed.

Files

File            Description
/etc/bin/splat  Simple Performance Lock Analysis Tool (splat). Provides kernel and pthread lock usage reports.