IBM Support

Stress test your AIX or Linux server with nstress

How To


Summary

nstress is a set of programs to keep many parts of the computer busy including CPU, memory, file systems, inter-process communications and disks.

Objective


Possible uses of the programs in the nstress toolbox are:

  • You may want to "burn in" new hardware to prove it is reliable before production use.

  • You may want to find out how fast your computer runs, for example memory speed or disk I/O.

  • You may want to generate "fake" workloads and then use performance monitoring tools like nmon or njmon to see the performance stats in action.

Environment

AIX or Linux, with no other users on your computer (or VM), because nstress can grind your computer to a halt if you overdo the workloads.
While I focus on POWER-based systems, these tools are also compiled for AMD64 (x86_64).

Steps

There are a number of uses:

  • Soak testing = check a new machine/disk to remove early life failures
  • Prove performance of machine upgrades or alternative disk configurations
  • Learn performance monitoring and tuning
  • For example, I run a Performance Tuning Master Class and need to quickly set up many different workloads and problems to be solved. With a 20-line shell script and these tools, I don't have to spend a week on setup.
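
As an illustration of the kind of short script mentioned above, here is a minimal sketch. It is not the script used in the class. It only prints the commands so you can review them first. The ./ path prefix, the 300 second default and the file name /tmp/nstress.dat are assumptions made for this example; the options themselves are taken from the help output of each tool.

```sh
#!/bin/sh
# Sketch only: prints the commands for a small mixed workload instead of
# running them. DURATION and /tmp/nstress.dat are hypothetical choices.

DURATION=${DURATION:-300}   # seconds; every tool below gets a time limit

fake_workload() {
    secs=$1
    # CPU: 2 processes, snoozing 50% of the time, masquerading as DB2inst
    # (-o must be the last option on the line)
    echo "./ncpu -p 2 -z 50 -s $secs -o \"DB2inst\" &"
    # Memory: randomly touch 250 MB, sleeping 80% of the time
    echo "./nmem64 -m 250 -s $secs -z 80 &"
    # Disk: random I/O, 75% reads, 4 processes, on an existing 256 MB file
    echo "./ndisk64 -f /tmp/nstress.dat -R -r 75 -b 8k -s 256M -M 4 -t $secs &"
    # Logging: 2 KB per second to standard error
    echo "./nlog -k 2 -s 1 -m $secs &"
    # Wait for all background tools to reach their time limit
    echo "wait"
}

fake_workload "$DURATION"
```

After reviewing the printed commands, you could pipe them to sh to run them. The ndisk64 line expects the data file to exist already, so create it with dd first.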

The programs are:

Name         Purpose
ncpu         Hammers the CPUs (can be slowed down to use a percentage)
ndisk        Removed - use ndisk64
ndisk64      Hammers the disk(s); compiled for large files so it can access very large files (many GBs)
ndiskaio     Removed - use ndisk64
ndiskmio     Removed - if you need this, ask. Uses the Modular I/O library from the AIX Expansion Pack, which is assumed to be installed (experimental, not currently available)
nmem         Hammers or touches memory
nmem64       Hammers or touches memory; compiled 64-bit so it can access large memory (many GBs)
nipc         Tests shared memory, semaphores and message queues in a ring of processes; takes 1 CPU
nlog         Generates output like error messages; you specify the number of KB of output per second
nfile        Creates, writes and deletes files to push the JFS and JFS log very hard
createfs.sh  Example script to create the file systems used by the scripts below - you will need to edit this for your system
dbstart.sh   Example script to start a fake database (RDBMS) - you will need to edit this for your system
webstart.sh  Example script to start a fake web server - you will need to edit this for your system

Download the programs:

  • These tar files contain all the binaries and scripts.
  • AIX version:
  • Linux on POWER version:
  • Linux on AMD64 (called x86_64 by Intel) version:
  • Warnings:
    • Do not consider these as benchmark programs - they are hardware stressing tools.
    • Do not compare AIX with Linux - it is different code and different compilers. Especially ndisk64.
    • Do not compare Linux ppc64 and Linux x86_64 - I did not install the Advanced Tool Chain on ppc64 to get the optimised compiler which would give up to 35% better performance.
    • Do not compare Linux with Linux - due to the different ages of the OS, different kernel levels, different libc, different GCC it is NOT a fair comparison. Also note some Distributions were NOT updated from current Online repositories and are the original Gold DVD level RPMs (RHEL and SLES) as I don't have repository access.
    • Special ndisk64_75_linux note:
      • Always use the -M procs option to specify the number of processes to use or it will hang
      • Recommend using the -s size option to specify the file size
      • The Async I/O is completely missing in the code - don't believe the -? help!
      • The -C (CIO) and -D (DIO) options are also missing. Does anyone know the equivalent Linux options for the open system call? You can get Direct I/O by using a /dev/rsda9 type volume, but make sure it is NOT a file system or it will trash your files in a nanosecond.
  • The Source Code is not available to the public.
  • If you want a version to run on Linux please ask and I will see what we can do.

Warranty = none

  • It's strictly at your own risk.
  • I have hung an entire server more than once!
  • When experimenting, use the option that will stop the tool in, say, 5 minutes; otherwise you may have to halt your server or VM the hard way.
  • If you run these as a regular user, no harm can be done (except making your system busy), but as the root user they can be dangerous and even hang the machine due to total saturation of CPU, memory or disk I/O.

Warning

  • Note: ncpu running as root will try to boost its priority.
  • This will effectively lock out an entire CPU(s) (if the slowing down options are not used).
  • Which can be a good thing, if that is what you want - this effectively removes the CPU from your configuration. These days dynamic LPAR changes are a safer way.

Additional Information

Note: Most commands now have the following options

Rename yourself option: -o

  • This allows the process to rename itself so that it looks like something else when it is running, for example, a database.
  • This HAS to be the LAST option on the line.
  • This also fools programs like ps and nmon.
  • I call this masquerading.

Snooze mode: -z

  • This causes the program to not run flat out and so behave like a user-driven command.
  • This is performed at the millisecond level and is quite effective.

Seconds to run and then stop: -s

  • This specifies the maximum time in seconds to run. Some tools use a different letter for the time limit: ndisk64 uses -t, and nlog and nfile use -m (see the help output of each program).
  • ALWAYS use this to make sure the programs stop; otherwise you can slow down a machine until you kill them or reboot. Also, if you "over cook" the workload by starting too many programs and the machine stops responding, you can at least wait for the timeout and let the machine recover (better than a reboot).
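
Because the time-limit letter is not the same in every tool, a small guard can help when writing scripts. The sketch below is only an illustration: the function names time_flag and has_time_limit are hypothetical, and the mapping is read from the help output of each program further down this page. Note that for nlog, -s is the interval and -m is the limit.

```sh
# time_flag TOOL : print the option that sets the maximum run time for TOOL
time_flag() {
    case "$1" in
        ncpu|nmem|nmem64|nipc) echo "-s" ;;
        ndisk64)               echo "-t" ;;   # defaults to 5 seconds if omitted
        nlog|nfile)            echo "-m" ;;
        *) echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

# has_time_limit TOOL ARGS... : succeed only if ARGS contain TOOL's time limit,
# written either as "-s 900" or joined as "-s900"
has_time_limit() {
    tool=$1
    shift
    flag=$(time_flag "$tool") || return 1
    for arg in "$@"; do
        case "$arg" in
            "$flag"|"$flag"[0-9]*) return 0 ;;
        esac
    done
    return 1
}

if has_time_limit nlog -k 2 -s 1; then
    echo "nlog: time limit found"
else
    echo "nlog: no time limit, add -m MaxSeconds"
fi
```

The check is deliberately simple: it does not look inside the -o "cmd" string, so keep the masquerade name free of option-like words if you rely on it.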

Manual pages = the help output of the programs

ncpu -h output
Usage: ncpu version 9.0 hammers the cpu(s)
Note: root users get a priority boost = effectively removes the CPU(s)

Hint hammer CPU mode: ncpu -p procs [-z percent] [-s secs] [-h secs] [-o "cmd"]
        -p procs   = number of copies of cpu to start (max=256)
        -z percent = Snooze percent - time spent sleeping (default 0)
        -s seconds = Seconds maximum run time (default no limit)
        -h seconds = Seconds to sleep after each second of run time
        -o "cmd" = Other command - pretend to be this other cmd when running
                Must be the last option on the line

Hints

Run in CPU counter mode:   ncpu -c

Run on eight CPUs, sleeping 20% of the time so that about 20% of each CPU is left unused, stop after 15 minutes, add jitter, and have the ps command report the processes as DB2:
  • ncpu -p 8 -z 20 -s 900 -h1 -o DB2inst

ndisk64 -h output

ndisk64 version 7.4
Complex Disk tests - sequential or random read and write mixture
ndisk64 -S          Sequential Disk I/O test (file or raw device)
        -R          Random    Disk I/O test (file or raw device)

        -t <secs>   Timed duration of the test in seconds (default 5)

        -f <file>   use "File" for disk I/O (can be a file or raw device)
        -f <list>   list of filenames to use (max 16) [separators :,+]
                        example: -f f1,f2,f3  or -f /dev/rlv1:/dev/rlv2
        -F <file>   <file> contains list of filenames, one per line (upto 2047 files)
        -M <num>    Multiple processes used to generate I/O
        -s <size>   file Size, use with K, M or G (mandatory for raw device)
                        examples: -s 1024K   or   -s 256M   or   -s 4G
                        The default is 32MB
        -r <read%> Read percent min=0,max=100 (default 80 =80%read+20%write)
                        example -r 50 (-r 0 = write only, -r 100 = read only)
        -b <size>   Block size, use with K, M or G (default 4KB)
        -O <size>   first byte offset use with K, M or G (times by proc#)
        -b <list>   or use a colon separated list of block sizes (31 max sizes)
                        example -b 512:1k:2K:8k:1M:2m
        -q          flush file to disk after each write (fsync())
        -Q          flush file to disk via open() O_SYNC flag
        -i <MB>     Use shared memory for I/O MB is the size(max=256 MB)
        -v          Verbose mode = gives extra stats but slower
        -l          Logging disk I/O mode = see *.log but slower still
        -o "cmd"  Other command - pretend to be this other cmd when running
                        Must be the last option on the line
        -K num      Shared memory key (default 0xdeadbeef) allows multiple programs
                    Note: if you kill ndisk,  you may have a shared memory
                    segment left over. Use ipcs and then ipcrm to remove it.
        -p          Pure = each Sequential thread does read or write not both
        -P file     Pure with separate file for writers
        -C          Open files in Concurrent I/O mode O_CIO
        -D          Open files in Direct I/O mode O_DIRECT
        -z percent  Snooze percent - time spent sleeping (default 0)
To make a file use dd, for 8 GB: dd if=/dev/zero of=myfile bs=1M count=8196

Asynchronous I/O tests (AIO) This now uses POSIX threads and  Async I/O API
        -A         switch on Async I/O use: -S/-R, -f/-F and -r, -M, -s, -b, -C, -D to determine I/O types
                (JFS file or raw device)
        -x <min>   minimum outstanding Async I/Os (default=1, min=1 and min<max)
        -X <max>   maximum outstanding Async I/Os (default=8, max=1024)
        see above -f <file>  -s <size>   -R <read%>  -b <size>

For example:
        dd if=/dev/zero of=bigfile bs=1m count=1024
        ndisk64 -f bigfile -S -r100 -b 4096:8k:64k:1m -t 600
        ndisk64 -f bigfile -R -r75 -b 4096:8k:64k:1m -q
        ndisk64 -F filelist -R -r75 -b 4096:8k:64k:1m -M 16
        ndisk64 -F filelist -R -r75 -b 4096:8k:64k:1m -M 16 -l -v

For example:
        ndisk64 for Asynch compiled in version
        ndisk64 -A -F filelist -R -r50 -b 4096:8k:64k:1m -M 16 -x 8 -X 64
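
The test files and the -F file list have to exist before ndisk64 runs. The sketch below shows the arithmetic for the dd count and one way to generate a file list. The function names and the /data/f prefix are hypothetical. An 8 GB file written in 1 MB blocks needs 8 x 1024 = 8192 blocks; the count=8196 in the help text above gives a file that is 4 MB larger, which is harmless. The help text shows both bs=1M and bs=1m; which spelling your dd accepts depends on the platform, so check your dd manual.

```sh
# mb_count GIGABYTES : number of 1 MB blocks needed for a file of that size
mb_count() {
    echo $(( $1 * 1024 ))
}

# print_dd_cmd FILE GIGABYTES : print (not run) the dd command that creates FILE
print_dd_cmd() {
    echo "dd if=/dev/zero of=$1 bs=1M count=$(mb_count "$2")"
}

# filelist PREFIX COUNT : print COUNT file names, one per line, as -F expects
filelist() {
    i=1
    while [ "$i" -le "$2" ]; do
        echo "$1$i"
        i=$(( i + 1 ))
    done
}

print_dd_cmd myfile 8     # prints: dd if=/dev/zero of=myfile bs=1M count=8192
filelist /data/f 3        # prints /data/f1, /data/f2 and /data/f3 on separate lines
```

Redirect the output of filelist into a file and pass that file to ndisk64 with -F.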


nmem64 -h output

$ ./nmem64
Usage: nmem version 6.1
Hint: nstress tool to hammer memory (do not use this on production machines)
 use 1: mallocs memory and then touches the memory pages at random
 use 2: cycles though memory speed test to determine/guess cache sizes (-c option)
 output includes memory size used, operations performed, time taken and ops per second

        nmem -m Mbytes [-s MaxSeconds] [-z percent][-o "cmd"]
                Mbytes     = Size of RAM to use in mega-bytes
                             For nmem the max is 255 (~256MB) and for nmem64 the max is 2047 (~2GB))
                MaxSeconds = maximum time of the test.
                        Use this to halt nmem even if you drove your OS to a stand still.
                        You have been warned!
                percent    = Snooze percent - nmem sleeps for the percentage of the time
                cmd        = nmem with pretend to be a process called cmd.
                             This option must be the last on the line in double quotes


        Memory speed test with increasing memory sizes
        - may highlight CPU cache sizes

        nmem -c [-s MaxSeconds]

                MaxSeconds = maximum time of the test (default 60)
Example:
        nmem -m 250 -s 300               - grab and randomly touch 256 MB of memory for 5 minutes
        nmem -m 250 -s 300 -z 80         - as above but slower, sleep 80% of the time
        nmem -c -s 600                   - cycle through tests (maximum of 10 minutes)
        nmem -m 6 -o "sally -x"          - Use 6 MB pretend to be process sally with parameter -x

If your OS complains about the size of memory use you hit your ulimit
         For 256MB+ try: ulimit -d unlimited
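
To check the limit before starting nmem, you can compare the requested size with the current data segment limit. This is a rough sketch under two assumptions: that your shell's ulimit -d reports kilobytes or the word unlimited (true for the common shells, but check yours), and that the process needs little memory beyond the -m size. The function name fits_data_limit is hypothetical.

```sh
# fits_data_limit MBYTES LIMIT : LIMIT is the output of "ulimit -d"
# (a number of KB, or the word "unlimited")
fits_data_limit() {
    if [ "$2" = "unlimited" ]; then
        return 0
    fi
    [ $(( $1 * 1024 )) -le "$2" ]
}

if fits_data_limit 250 "$(ulimit -d)"; then
    echo "250 MB fits within the data segment limit"
else
    echo "limit too low, try: ulimit -d unlimited"
fi
```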


nlog -h output

./nlog -h
Usage: nlog version 3.0
Hint: generates a log file at a steady rate to standard error
        nlog -k Kbytes -s Seconds -m MaxSeconds -o "cmd"
        nlog -k 1 -s 60                 is the default
Example:
        nlog -k 2 -s 1 -m 600            - 2 K per second for 10 minutes
        nlog -k 2 -s 1                   - 2 K per second forever
        nlog -k 2 -s 1 -o "bert -x"      - 2 K per second pretend to be process bert with parameter -x
        nlog -k 1 -s 60 -m 300           - 1 K per minute for 5 minutes
        nlog -k 200 -s 1 -m 3600         - 200 K per second for an hour
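
If you redirect the nlog output to a file, it is worth working out how much it will write before you start. The sketch below simply multiplies the rate by the run time, assuming nlog writes exactly -k KB every -s seconds as the help text describes. The function name nlog_total_kb is hypothetical.

```sh
# nlog_total_kb KBYTES SECONDS MAXSECONDS : total KB written over the whole run
nlog_total_kb() {
    echo $(( $1 * $3 / $2 ))
}

nlog_total_kb 200 1 3600   # prints 720000 (about 703 MB in an hour)
nlog_total_kb 1 60 300     # prints 5
```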


nipc -h output

This runs processes that communicate with shared memory with control via a semaphore and then back again with a message queue.

Usage: nipc version 2.0
nipc: hammers inter-process communication (IPC) 
    that is shared memory, semaphores and message queues

hint: nipc -p procs -s seconds

nfile -h output

Usage: nfile version 2.0
Hint: creates and deletes files = generates JFS log file work
nfile -d directory [-k Kbytes] [-c Files] [-m MaxSeconds] [-z percent] [-o "cmd"]
        -d directory  - top level directory
        -k Kbytes     - size of the files to create in KB (default 4KB)
        -c Files      - number of files to maintain (+/- 10) (default 4096)
        -m MaxSeconds - stop of so many seconds
        -z percent    - percent of time to sleep/snooze
        -o "cmd"    - pretend to be a different cmd (must be the last option)
Example:
        nfile -d mydir -k 1 -c 10000 -m 600
        nfile -d /tmp/files -k 64 -o "bert -x"
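
Before pointing nfile at a file system, you can estimate the space it will hold at any one time. The sketch below uses the upper end of the "+/- 10" file count and ignores file system metadata and block rounding, so treat the result as a lower bound on what to leave free. The function name nfile_space_kb is hypothetical.

```sh
# nfile_space_kb FILES KBYTES : approximate peak data held in the directory, in KB
nfile_space_kb() {
    echo $(( ($1 + 10) * $2 ))
}

nfile_space_kb 4096 4      # defaults: prints 16424 (about 16 MB)
nfile_space_kb 10000 1     # first example above: prints 10010
```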

If you find errors or have questions, email me:

  • Subject: nstress
  • E-mail: n a g @ u k . i b m . c o m  


Document Location

Worldwide


Document Information

Modified date:
14 January 2021

UID

ibm11111065