Question & Answer
Question
How to interpret the resource usage metrics in a Netezza / PureData for Analytics plan file ?
Answer
Understanding resource usage metrics in Netezza plan file
Understanding the resource usage characteristics of a plan is an important aspect to performance testing or performance troubleshooting. Plan resource metrics are captured with granularity to enable great insight into why a plan performs as it does. These metrics are made available to the user in multiple ways. First, the plan file itself will display snippet level metrics at the completion of each snippet and a resource usage summary when the plan is complete. A second way to access a plan’s resource utilization is with the _V_PLAN_RESOURCE system view. This data is temporal, holding the most recent 1500 plans, but it can be persisted using the Software Support Toolkit script /nz/support/bin/nz_query_history along with its partner script /nz/support/bin/nz_query_history.ddl to create the required objects. View the help output of this script (/nz/support/bin/nz_query_history –h) for more details. Persisting this Plan Resource data is highly recommended as you will find all kinds of interesting and valuable ways to mine this data.
Location of the Plan files:
Plan files are stored in the /nz/kit/log/planshist directory. The number of plans kept is configurable with system registry variables:
host.planHistFiles (default=2000)
host.planHistKeepArchives (default=10)
The system will keep 10 plan archives (.tgz files) each holding 2000 plan files based on the default setting. Increasing this to allow for a longer history can often be useful when needing to compare plans for the same query on different days. For instance, a query ran well last week, but is very slow today. It would be good to compare these two plan files. Do be careful in increasing host.planHistFiles above 5000 as this setting does impact allocations from the shared memory segment. Setting host.planHistFiles to a high level in combination with a flurry of large (several hundred snippet) plan files can lead to temporary memory constraint issues. Increasing host.planHistKeepArchives only has the negative impact of consuming a relatively insignificant amount of additional disk in the /nz filesystem.
Viewing the Plan file:
You must be aware of the methods used to display the plan file of interest.
Via SQL:
nzsql –c “SHOW PLANFILE <planid> ;”
--- this will display the raw plan file as it is stored in the archives.
Via Software Support Toolkit script:
/nz/support/bin/nz_plan <planid>
--- this script enhances the raw plan with additional information. The enhanced information is quite useful and thus this is the recommended method. Providing the additional information does incur extra processing that can be costly in terms of time and resource utilization. If time or resource contention is a problem, you can still use this script with the –fast option to get all but the high cost information.
/nz/support/bin/nz_plan –fast <planid>
Snippet Level Compilation Information:
Each snippet will produce several lines, showing details of its execution. The first line(s) will show information about the object code that will be used for the snippet. If the snippet is SPU or Host only, there will only be one line. If the snippet runs on both SPU and Host, there will be a line for each.
-- Object "/nz/data.1.0/plans/1434978/s1434978_3.o" compiled dur 0.060 complexity 2151
-- Object "/nz/data.1.0/plans/1434978/h1434978_503.o" from cache "/nz/data.1.0/cache/155/60o"
In the example above, you can see reference to 1434978_3. This indicates plan ID = 1434978 and snippet = 3.
The “Object” line is compilation information. The first Object line is for a SPU level snippet as indicated by the plan suffix of _3. The plan suffix for a Host level snippet will be in the 500 range. If not above 500, the information will represent a SPU level snippet. Notice the second “Object” line is for Host level snippet as indicated by the plan suffix of _503. These lines show if a compilation is needed for this snippet or if the object code can be found already in cache.
In this case, for the SPU level snippet, a compilation is needed and shows the compile time duration and a relative level of complexity.
-- Object "/nz/data.1.0/plans/1434978/s1434978_3.o" compiled dur 0.060 complexity 2151
In this case, for the Host level snippet, compilation is not needed. An object file in cache is used.
-- Object "/nz/data.1.0/plans/1434978/h1434978_503.o" from cache "/nz/data.1.0/cache/155/60o"
Snippet Level Execution Information:
Following the Object line(s) will be information about the snippet’s execution timing and resource utilization.
-- Snippet 1434978_3 queue 0.001 setup 0.002 exec 0.246 @ 30% res 0.080 channels 2 hmem 3000 mem 2000 estrows 504 sqb exp h/s 1%/3%
-- Detail 1434978_3 table 1 scan 0/0.29/1 hits 0/0.29/1 zmread 1/1.86/2 zmhits 1/1.86/2 pz 0%/29.43%/100% #565 trans 1 trows 0/4 hcpu 0.078 cpu 0.000/0.020 dread 0.003 mem 5 temp 1
The first line, --Snippet, generally describes overall timing. This will show a set of values in order. Only non-zero values will appear in the line.
Field | Unit | Description |
| queue | Sec | Time spent in the snippet queue |
| setup | Sec | Time spent in snippet setup |
| exec | Sec | Time spent in snippet execution |
| @ | % | Average resource allocation during execution |
| ceil | Sec | Time spent in ceiling delay |
| delay | Sec | Time spent in greed delay |
| res | Sec | Total system resource used |
| channels | Channels | Total number of channels |
| hmem | Blocks | Estimated host memory blocks |
| mem | Blocks | Estimated SPU blocks |
| estrows | Rows | Estimated number of rows (per dataslice) |
| rslice | Bool | True if the planner expects most dataslices to have data |
| exp h/s | % | Amount of host/spu resources expected/guessed the snippet would use |
| sqb | - | Shown if snippet is under Short Query Bias governance |
| misfit-s | - | Shown if snippet waited for SPU memory |
| misfit-h | - | Shown if snippet waited for host memory |
| misfit-c | - | Shown if snippet waited for communication channels |
| misfit-g | - | Shown if snippet waited for group memory |
| misfit-d | - | Shown if snippet waited for spu2spu distribution channel |
| misfit-b | - | Shown if snippet waited for broadcast channel |
| misfit-r | - | Shown if snippet waited for return channel |
The second line, --Detail, focuses on the resource usage. Only values will be shown if applicable. For instance, if a table is not scanned, the “table” and “scan” values will not appear.
For “table”, “scan”, and “pz” <min>/<average>/<max> is shown. If all values are equal, only one value will be shown. For SPU resources, <average>/<max> is shown. If they match each other, only one value will be shown.
Field | Locus | Unit | Description |
| table | SPU | Blocks | Min/Avg/Max number of real (not transient) table blocks |
| scan | SPU | Blocks | Min/Avg/Max number of blocks that survived zonemap filter |
| pz | SPU | % | Min/Avg/Max of pages for this snippet where the extent level zonemap information was overruled by the page level information. This is a measure of page-level zonemap benefit. |
| hits | SPU | Blocks | Table cache hits. (scan hits = actual disk reads) |
| zmread | SPU | Blocks | Number of zonemap pages read |
| zmhits | SPU | Blocks | Number of zonemap page reads that were satisfied by cache |
| # | SPU | - | Number of dataslices that returned data |
| = | SPU | - | If only one dataslice had data, this shows its ID |
| trans | SPU | Blocks | Avg/Max number of blocks written to transient |
| trows | SPU | Rows | Avg/Max number of rows written to transient |
| hcpu | Host | Sec | Host CPU time |
| hread | Host | Sec | Host read time * |
| hwrite | Host | Sec | Host write time * |
| hfabric | Host | Sec | Host network time (internal network) * |
| cpu | SPU | Sec | Avg/Max SPU CPU time |
| dread | SPU | Sec | Avg/Max SPU data disk read time * |
| dwrite | SPU | Sec | Avg/Max SPU data disk write time * |
| fpga | SPU | Sec | FPGA time used to process disk scans |
| tread | SPU | Sec | Avg/Max SPU temporary disk read time * |
| twrite | SPU | Sec | Avg/Max SPU temporary disk write time * |
| fabric | SPU | Sec | Avg/Max internal network time * |
| hmem | Host | Blocks | Memory used in blocks |
| mem | SPU | Blocks | Avg/Max SPU memory in blocks |
| temp | SPU | Blocks | The dataslice with the largest number of temp pages in use |
| null | Host | Sec | “Null” I/O time spent waiting for I/O on external sources |
| spuev# | SPU | - | Number of SPU event processes that ran for this snippet |
| Spuev= | SPU | - | ID of SPU event process that ran (if only a single event process) |
* Disk and fabric times are calculated by multiplying number of bytes X average rate.
The “table” and “scan” information helps illustrate the how much data is being filtered out. Disparities between Max and Average can identify data skew.
The “pz” field helps identify if page granular zone maps are present and working.
The “#” field can be a strong indicator of distribute restrict optimization effectiveness.
The "tread" and "twrite" fields are not reliable indications of transient size. They represent actual I/O to nzlocal for any purpose, including sort spill files, DHJ partitions, etc. Normally temporary file data is written to and read from the Linux file cache. That time is not recorded. Only the bytes actually flushed or read from nzlocal is recorded, and then converted to time using an average rate.
When troubleshooting a performance problem with a “slow” plan. One common task is to compare these metrics to the metrics of a version of this plan when it performed as expected. This can help shed light as to where the real differences are.
Snippet skipped thanks to Snippet Result Cache:
When a snippet is skipped because its result can be provided by the Snippet Result Cache, there is no need for an object file and the execution information described above. In this case, a line similar to this will appear.
-- Snippet 1434978_3 recon 0.005 @ 0% exp h/s 0%/3% created 2166871853 from cached 2166871507 (1432451_12)
Plan Summary Information:
----------------
Time Duration
Job Submit 2016-12-12 17:16:10.175179 EST 2.032 (prior plans subtotal)
Plan Submit 2016-12-12 17:16:12.207442 EST 0.041 (plan waiting)
Plan Start 2016-12-12 17:16:12.248020 EST 12.990 (plan execution)
Plan Finish 2016-12-12 17:16:25.238290 EST 13.031 (plan total)
Job Finish 2016-12-12 17:16:25.238290 EST 15.063 (job total)
----------------
Estimate: Planner 5.118 Prep 0.337 Scheduler 0.337 Hmem 296 Mem 2000 Channels 2 SQB Exp 3%
Actual: Parse 0.017 Rewrite 28.225 Plan 2.151 TXQ 0.032 Prep 0.008 SNQ 0.025 Setup 0.065 Recon 0.062 Exec 4.203 @ 61% Cache 0.021 Res 10.845 Channels 1 Mem 141 Temp 13 Rows 8 Bytes 5.7K
User: tx 0x32f844c cli 83 uid 57262 sid 277307 pid [35667] group 2309617/RSG_AOD_PROD_LOW
Resource: Hcpu 9.367 Cpu 1.044/1.220 Dread 0.433/0.459 Fpga 0.151/0.159 Fabric 0.003
If there are prior plans (JIT scans, prebroadcast, etc.) for this query/Job, there will be included a Job Submit line before the Plan Submit line and a Job Finish line after the Plan Finish line. If there are no prior plans, these Job lines will be omitted.
The resource line will be condensed using the same algorithms described above for per-snippet resources. That is, skipping non-zero fields, showing max only if different than average, etc. In general, the resource metrics at the Snippet level are the more valuable for analysis.
Note that the fields have an initial capital letter. This distinguishes them from snippet resources.
Was this topic helpful?
Document Information
Modified date:
17 October 2019
UID
swg21995797