Evaluate performance for Linux on POWER

Analyze performance using Linux tools

Note:

  • White boxes are specific POWER7 PMCs watched in a profile: Completion Stall Cycles <C>, Stall by FXU <C2>, FXU Multi-cycle Instruction <C2A>, Stall by Scalar <C3C>, Stall by Scalar Long <C3C1>, Stall by Vector <C3B>, Stall by Vector Long <C3B1>, Stall by DFU <C3A>, Stall by LSU <C1>, Stall by Reject <C1A>, Translation Stall <C1A1>, Other Reject <C1A2>, Stall by D-cache Miss <C1B>, Stall Store <C1C>, Stall due SMT <C4>, Stall due IFU <C5>, Stall due BRU <C5A>, GCT Empty Cycles <B>, GCT Empty due Icache Miss <B1>, GCT Empty due branch Mispredict <B2>, GCT Empty due branch Mispredict and Icache Miss <B3>, Completion Cycles <A>, Base Completion Cycles <A1>
  • Gray boxes [marked with an asterisk (*)] are calculated (these metrics have no specific hardware counters): Stall by VSU <C3>, FXU Other, Stall by Scalar Other <C3C2>, Stall by Vector Other <C3B2>, LSU Other <C1D>, Other IFU Stall <C5B>, Other Stall <C6>, GCT Empty Other, Overhead of expansion

(Print using landscape format.)

Table 1. Partial POWER7 CBM
Each indentation level is one table column, from Column 1 (leftmost) to Column 5 (deepest). Where an entry has a formula, it appears in parentheses before the event name.

Cycles (PM_RUN_CYC)
    Completion Stall Cycles <C> (PM_CMPLU_STALL)
        Stall by FXU <C2> (PM_CMPLU_STALL_FXU)
            FXU Multi-cycle Instruction <C2A> (PM_CMPLU_STALL_DIV)
            FXU Other * (C2 - C2A) (PM_CMPLU_STALL_FXU_OTHER)
        Stall by VSU <C3> * (C3A + C3B + C3C) (PM_CMPLU_STALL_VSU)
            Stall by Scalar <C3C> (PM_CMPLU_STALL_SCALAR)
                Stall by Scalar Long <C3C1> (PM_CMPLU_STALL_SCALAR_LONG)
                Stall by Scalar Other <C3C2> * (C3C - C3C1) (PM_CMPLU_STALL_SCALAR_OTHER)
            Stall by Vector <C3B> (PM_CMPLU_STALL_VECTOR)
                Stall by Vector Long <C3B1> (PM_CMPLU_STALL_VECTOR_LONG)
                Stall by Vector Other <C3B2> * (C3B - C3B1) (PM_CMPLU_STALL_VECTOR_OTHER)
            Stall by DFU <C3A> (PM_CMPLU_STALL_DFP)
        Stall by LSU <C1> (PM_CMPLU_STALL_LSU)
            Stall by Reject <C1A> (PM_CMPLU_STALL_REJECT)
                Translation Stall <C1A1> (PM_CMPLU_STALL_ERAT_MISS)
                Other Reject <C1A2> (C1A - C1A1) (PM_CMPLU_STALL_ERAT_OTHER)
            Stall by D-cache Miss <C1B> (PM_CMPLU_STALL_DCACHE_MISS)
            Stall Store <C1C> (PM_CMPLU_STALL_STORE)
            LSU Other <C1D> * (C1 - C1A - C1B - C1C) (PM_CMPLU_STALL_LSU_OTHER)
        Stall due SMT <C4> (PM_CMPLU_STALL_THRD)
        Stall due IFU <C5> (PM_CMPLU_STALL_IFU)
            Stall due BRU <C5A> (PM_CMPLU_STALL_BRU)
            Other IFU Stall <C5B> * (C5 - C5A) (PM_CMPLU_STALL_IFU_OTHER)
        Other Stall <C6> * (C - C1 - C2 - C3 - C4 - C5) (PM_CMPLU_STALL_OTHER)
    GCT Empty Cycles <B> (PM_GCT_NOSLOT_CYC)
        GCT Empty due Icache Miss <B1> (PM_GCT_NOSLOT_IC_MISS)
        GCT Empty due branch Mispredict <B2> (PM_GCT_NOSLOT_BR_MPRED)
        GCT Empty due branch Mispredict and Icache Miss <B3> (PM_GCT_NOSLOT_BR_MPRED_IC_MISS)
        GCT Empty Other * (B - B1 - B2 - B3) (PM_GCT_EMPTY_OTHER)
    Completion Cycles <A> (PM_GRP_CMPL)
        Base Completion Cycles <A1> (PM_1PLUS_PPC_CMPL)
        Overhead of expansion * (A - A1)
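
The calculated entries can be derived from the counted ones with plain subtraction and addition. The following Python sketch is only an illustration of the formulas in Table 1, not part of any profiling tool. It assumes that Stall Store is <C1C>, that LSU Other is C1 minus its three children, and that GCT Empty Other is B minus B1, B2, and B3, which is how the hierarchy reads. The function name derive_cbm, the keys C2_OTHER, B_OTHER, and A_OVERHEAD (for entries the table leaves without an ID), and all sample values are hypothetical.

```python
def derive_cbm(c):
    """Return the calculated CBM entries from counted values.

    `c` maps the IDs used in Table 1 (for example "C1A") to cycle counts.
    Keys C2_OTHER, B_OTHER and A_OVERHEAD are made-up names for entries
    that have no ID in the table.
    """
    d = {}
    d["C2_OTHER"] = c["C2"] - c["C2A"]                    # FXU Other
    d["C3"] = c["C3A"] + c["C3B"] + c["C3C"]              # Stall by VSU
    d["C3C2"] = c["C3C"] - c["C3C1"]                      # Stall by Scalar Other
    d["C3B2"] = c["C3B"] - c["C3B1"]                      # Stall by Vector Other
    d["C1A2"] = c["C1A"] - c["C1A1"]                      # Other Reject
    d["C1D"] = c["C1"] - c["C1A"] - c["C1B"] - c["C1C"]   # LSU Other
    d["C5B"] = c["C5"] - c["C5A"]                         # Other IFU Stall
    # Other Stall uses the calculated C3, so it is computed after it.
    d["C6"] = c["C"] - c["C1"] - c["C2"] - d["C3"] - c["C4"] - c["C5"]
    d["B_OTHER"] = c["B"] - c["B1"] - c["B2"] - c["B3"]   # GCT Empty Other
    d["A_OVERHEAD"] = c["A"] - c["A1"]                    # Overhead of expansion
    return d


# Invented cycle counts, chosen only to exercise the formulas.
sample = {
    "C": 600, "C1": 300, "C1A": 100, "C1A1": 60, "C1B": 120, "C1C": 30,
    "C2": 80, "C2A": 20,
    "C3A": 10, "C3B": 40, "C3B1": 15, "C3C": 50, "C3C1": 20,
    "C4": 30, "C5": 40, "C5A": 25,
    "B": 150, "B1": 50, "B2": 60, "B3": 10,
    "A": 250, "A1": 200,
}

derived = derive_cbm(sample)
for key in sorted(derived):
    print(f"{key:>10}: {derived[key]}")
```

A negative result for any calculated entry would suggest that the counted values were not collected over the same run, because every formula subtracts children from their own parent.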
