
High CPU usage due to page faults

‏2011-09-19T15:26:53Z |
I have an LPAR with AIX 6100-05-03-1036 on P570 hardware.
We have a problem of high system CPU usage related to page faults, as this nmon memory panel shows:
Memory       Physical   PageSpace |        pages/sec    In    Out | FileSystemCache
% Used          81.4%        0.7% | to Paging Space    0.0    0.0 | (numperm)  18.4%
% Free          18.6%       99.3% | to File System     0.5   45.5 | Process    26.8%
MB Used     11663.1MB      30.1MB | Page Scans         0.0        | System     36.1%
MB Free      2672.9MB    4065.9MB | Page Cycles        0.0        | Free       18.6%
Total(MB)   14336.0MB    4096.0MB | Page Steals        0.0        |
                                  | Page Faults    21685.7        | Total     100.0%
                                  |                               | numclient  18.4%

The filesystem cache is not the problem because, as you can see, there is little filesystem cache activity.
The M option in nmon shows:

│--- Below are Rates per Second --- 4KB 16MB │
│ numpermio non-w.s. pageouts 0 0 │
│ pgexct Page Faults 63212 0 │
│ pgrclm Page Reclaims 0 0 │
│ pageins Paged in -All 30 0 │
│ pageouts Paged out -All 220 0 │
│ pgspgins Paged in -PageSpace 0 0 │
│ pgspgouts Paged out-PageSpace 0 0 │
│ numsios I/O Started 250 0 │
│ numiodone I/O Completed 81 0 │
│ zerofills Zero filled 21563 0 │
│ exfills Exec() filled 37 0 │
│ scans Scans by clock 0 0 │
│ cycles Clock hand cycles 0 0 │
│ pgsteals Page Steals 0 0 │

So there are lots of "pgexct Page Faults" and "zerofills Zero filled" faults. This machine runs a WAS server whose Java processes also show page faults, but at any given moment the page faults of all the Java processes do not add up to the large amount shown here.
Could somebody explain what this kind of page fault means? I'd like to know whether some problem in the Java settings is producing this, and what else I could use to find the origin of such a large number of page faults.
  • nagger
    1698 Posts

    Re: High CPU usage due to page faults

    You have free memory, so this is not a case of heavy paging out to make room for more memory.
    Zero fill is what happens when an application asks (typically through the malloc() library call) for more memory for its heap (scratch pad memory) and then touches the new pages for the first time.
    It is zero filled to remove the previous content for security reasons, so a program can't accidentally get a page with the contents of the last process using it. This is not paging in or out to disk.
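To make the zero-fill behaviour concrete, here is a minimal sketch in Python. It assumes a Unix-like system where the `resource` module is available and reports minor faults; the function name `touch_fresh_memory` and the page count are made up for the illustration, and the exact fault count depends on the operating system and its page size settings. It maps fresh anonymous memory, checks that the first page reads back as zeros, then writes one byte to every page and reports how many faults the process took doing so, all without any disk I/O.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE


def minor_faults():
    """Page faults of this process that were served without disk I/O."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch_fresh_memory(n_pages=4096):
    """Map fresh anonymous memory and touch every page once.

    Returns (faults_taken, first_page_was_zero). n_pages is an arbitrary
    demo size, not a value taken from the post.
    """
    buf = mmap.mmap(-1, n_pages * PAGE)        # fresh anonymous memory
    first_page_was_zero = buf[:PAGE] == bytes(PAGE)
    before = minor_faults()
    for offset in range(0, n_pages * PAGE, PAGE):
        buf[offset] = 1                        # first write to each page
    after = minor_faults()
    buf.close()
    return after - before, first_page_was_zero


if __name__ == "__main__":
    faults, was_zero = touch_fresh_memory()
    print("fresh pages read as zero:", was_zero)
    print("faults taken while touching pages:", faults)
```

A process that keeps requesting and touching new memory in this way generates a steady stream of zero-fill faults even when nothing is being paged to or from disk.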

    This is what normally happens when a process starts, and as WAS is a large-memory Java process it is very common. If this system was recently started, do not worry. It also happens when a machine is taking on lots more work (for example, as online users arrive in the morning); it should then settle down.
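As a rough sanity check on the numbers in the question, the zero-fill rate can be converted into memory per second. This is only back-of-the-envelope arithmetic using the figures quoted from nmon (21563 zero fills per second of 4KB pages, 2672.9MB free); the helper names are invented for the example.

```python
def zero_fill_mb_per_sec(zerofills_per_sec, page_kb=4):
    """Convert a zero-fill fault rate into MB of fresh memory per second."""
    return zerofills_per_sec * page_kb / 1024


def seconds_to_consume(free_mb, mb_per_sec):
    """Time for that rate to use up free memory if nothing were released."""
    return free_mb / mb_per_sec


if __name__ == "__main__":
    rate = zero_fill_mb_per_sec(21563)          # figure from the nmon output
    print(round(rate, 1), "MB/s zero filled")   # 84.2 MB/s
    print(round(seconds_to_consume(2672.9, rate)), "seconds of free memory")
```

At roughly 84MB per second, the free memory shown would last only about half a minute if nothing were given back. If free memory stays steady while the rate continues, that suggests memory is being released and requested again repeatedly rather than a one-off start-up burst, though the nmon figures alone do not prove which processes are responsible.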

    Also, you do not state the size of the machine. I use a rule of thumb based on the number of processors in the machine: large POWER7 or POWER6 machines can sustain quite high paging rates to disk, but you do need to spread the paging space across lots of disks, either physically or via (for example) SAN disks with many underlying disks behind a RAID5 LUN, and even better with plenty of caching.

    Thanks, Nigel Griffiths