IBM Support

Stress test your AIX or Linux server with nstress

How To


Summary

The "nstress" package is a set of programs to keep many parts of the computer busy including CPU, memory, file systems, inter-process communications, and disks.

Objective

Possible uses of the programs in the nstress toolbox are:

  • You can "burn in" new hardware to prove it is reliable before production use

  • You can find out how fast your computer runs, for example memory speeds or disk I/O

  • You can generate "fake" workloads and then use performance monitoring tools like nmon or njmon to see the performance stats in action.

Environment

If you overdo the generated workload, nstress can grind your computer to a halt, so make sure there are no other users on your computer (or VM).
The author focuses on POWER processor-based computers, but these tools are also compiled for AMD64 (x86_64).

Steps

There are a number of uses:

  • Soak testing = check a new machine/disk to remove early life failures
  • Prove performance of machine upgrades or alternative disk configurations
  • Learn performance monitoring and tuning
  • For example, I run a Performance Tuning Expert Class and need to quickly set up many different workloads and problems to be solved. With a 20-line shell script and these tools, I don't have to spend a week on setup.
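
As a rough illustration of the kind of short setup script mentioned above, the following sketch starts a small mixed workload and gives every tool a time limit. It uses only options that appear in the help output in this document. The names DRYRUN, SECS, WORKDIR, launch, and start_workload, the directory /tmp/nstress_files, and the process counts and sizes are assumptions made for this example. With DRYRUN=1 (the default here), the commands are printed instead of being run.

```shell
#!/bin/sh
# Sketch of a small mixed workload; every tool gets a time limit.
# DRYRUN, SECS, WORKDIR, launch and start_workload are hypothetical names.
DRYRUN=${DRYRUN:-1}                     # 1 = only print the commands
SECS=${SECS:-300}                       # run time limit for every tool
WORKDIR=${WORKDIR:-/tmp/nstress_files}  # directory for nfile to work in

# Print the command in dry-run mode, otherwise start it in the background.
launch() {
    if [ "$DRYRUN" = "1" ]; then
        echo "$*"
    else
        "$@" &
    fi
}

start_workload() {
    launch ncpu -p 2 -z 50 -s "$SECS"                   # two CPU hogs, snoozing 50%
    launch nmem64 -m 512 -s "$SECS"                     # touch 512 MB of memory
    launch nlog -k 2 -s 1 -m "$SECS"                    # 2 KB of log output per second
    launch nfile -d "$WORKDIR" -k 4 -c 1000 -m "$SECS"  # create and delete 4 KB files
}

[ "$DRYRUN" = "1" ] || mkdir -p "$WORKDIR"
start_workload
[ "$DRYRUN" = "1" ] || wait    # wait for all tools to reach their time limit
```

Run it once with the default dry-run setting to check the command lines, then set DRYRUN=0 on a test machine only.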

The programs are:

Name         Purpose
ncpu         Hammers the CPUs (can be slowed down to use a percentage)
ndisk        Removed - use ndisk64
ndisk64      Hammers the disks - compiled for large files so that it can access files of many GBs
ndiskaio     Removed - use ndisk64
ndiskmio     Removed. Used the Modular IO library from the AIX Expansion Pack, which must be installed (experimental, not currently available)
nmem         Hammers or touches memory
nmem64       Hammers or touches memory - compiled 64-bit so that it can access large memory (many GBs)
nipc         Tests shared memory, semaphores, and message queues in a ring of processes - takes 1 CPU
nlog         Generates output like error messages. You specify the data rate in kilobytes (KB) of output per second
nfile        Creates, writes, and deletes files to push the JFS and JFS log hard
createfs.sh  Example script to create the file systems used by the scripts - you need to edit the file for your system
dbstart.sh   Example script to start a fake database RDBMS - you need to edit the file for your system
webstart.sh  Example script to start a fake web server - you need to edit the file for your system

Download the programs:

  • These files contain all the binary command files and scripts.
  • AIX Tools:
  • Linux on POWER tools:
  • Linux on AMD64 (called x86_64 by Intel) tools:
  • Warnings:
    • Do not consider these files as benchmark programs - they are hardware stressing tools.
    • Do not compare AIX with Linux - it is different code and different compilers. Especially ndisk64.
    • Do not compare Linux ppc64 and Linux x86_64 - I did not install the Advanced Tool Chain on ppc64 to get the optimized compiler, which would give up to 35% better performance.
    • Do not compare Linux with Linux - due to the different ages of the OS, different kernel levels, different libc, different GCC it is NOT a fair comparison. Note some Distributions were NOT updated from current Online repositories and are the original Gold DVD level RPMs (RHEL and SLES) as I don't have repository access.
    • Special ndisk64 version 75 notes:
      • Always use the -M procs option to specify the number of processes to use, or it hangs.
      • Recommended: use the -s size option to specify the file size.
      • The Async I/O is completely missing in the code - don't believe the -? help.
      • The command-line options -C for CIO and -D for DIO are not available.
        To get Direct I/O, use a /dev/rsda9 type volume, but make sure it does NOT hold a file system, or ndisk64 trashes your files in a nanosecond.
  • The Source Code is not available to the public.
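
One way to follow the version 75 notes above is to build every ndisk64 command line through a small helper that refuses to work without a file, a size, and a process count. This is only a sketch: the function name ndisk64_cmd and its argument order are hypothetical, and the helper prints the command rather than running it so that you can review it first.

```shell
#!/bin/sh
# Build an ndisk64 command line that always carries -s and -M.
# ndisk64_cmd is a hypothetical helper; it prints the command, it does not run it.
ndisk64_cmd() {
    # usage: ndisk64_cmd <file> <size> <procs> [extra ndisk64 options...]
    if [ "$#" -lt 3 ]; then
        echo "usage: ndisk64_cmd file size procs [options]" >&2
        return 1
    fi
    nd_file=$1
    nd_size=$2
    nd_procs=$3
    shift 3
    # Put the mandatory options first, then whatever the caller added.
    set -- ndisk64 -f "$nd_file" -s "$nd_size" -M "$nd_procs" "$@"
    echo "$*"
}

ndisk64_cmd bigfile 1G 4 -R -r 75 -b 8k -t 300
```

The example call prints a random I/O test with 75% reads, 8 KB blocks, four processes, and a 300 second limit, using only options listed in the ndisk64 help output.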

Warranty = none

  • It is strictly at your own risk.
  • It is possible to hang the entire server.
  • Use the option that stops the tool after, say, 5 minutes; otherwise you might need to halt your server or VM the hard way.
  • If you run these programs as a regular user, then no harm can be done to your system.
  • If you run these programs as the root user, they can be dangerous and even hang the machine due to total saturation of CPU or memory or disk I/O.
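
As a backstop to each tool's own stop option, a generic shell watchdog can kill a command after a fixed time. The sketch below is not part of nstress, and run_with_limit is a hypothetical name. It has limits: it kills only the process it started, so a tool that forks worker processes might leave them running, and on a fully saturated machine the watchdog itself might not get scheduled. Treat it as a second line of defense, not a replacement for the built-in time limit.

```shell
#!/bin/sh
# Run a command with a hard time limit.
# run_with_limit is a hypothetical helper, not part of nstress.
run_with_limit() {
    # usage: run_with_limit <seconds> <command> [args...]
    rl_secs=$1
    shift
    "$@" &
    rl_pid=$!
    # Watchdog: after rl_secs seconds, kill the command if it is still running.
    ( sleep "$rl_secs"; kill "$rl_pid" 2>/dev/null ) >/dev/null 2>&1 &
    rl_watch=$!
    # Wait for the command; keep its exit status (143 if the watchdog killed it).
    wait "$rl_pid" 2>/dev/null && rl_status=0 || rl_status=$?
    # The command is gone, so the watchdog is no longer needed.
    kill "$rl_watch" 2>/dev/null || true
    return "$rl_status"
}
```

For example, `run_with_limit 330 ncpu -p 4 -s 300` asks ncpu to stop itself after 300 seconds and has the watchdog step in 30 seconds later if it did not.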

Warning

  • Note: ncpu running as the root user tries to boost its priority.
  • This CPU priority boost effectively locks out an entire CPU, which can be a good option to have.
  • It effectively removes the CPU from your configuration. Note: Since 2010, PowerVM dynamic LPAR changes (DLPAR) are a safer way to do that.

Additional Information

Note: Most commands now have the following options

Rename yourself option: -o

  • This option allows the process to rename itself so that it looks like something else, for example a database, when it is running.
  • This HAS to be the LAST option on the line.
  • This option fools the ps and nmon programs.
  • I call this masquerading.

Snooze mode: -z

  • This option stops the program from running flat out, so that it behaves more like a user-driven command.
  • This snoozing is performed at the millisecond level and takes effect quickly.

Seconds to run and then stop: -s

  • This option specifies the maximum time in seconds to run.
  • ALWAYS use this option to make sure the programs stop, otherwise you can slow down a machine forever (or until you reboot). If you "over cook" the workload by starting too many programs and the machine stops responding, waiting for this time limit to expire is better than rebooting the server or virtual machine.
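
Going by the help output in this document, the letter of the time-limit option is not the same for every tool: ncpu, nmem, and nipc use -s, ndisk64 uses -t (its -s is the file size), and nlog and nfile use -m. The small lookup below is a sketch that makes the difference explicit in a script; time_limit_flag is a hypothetical name.

```shell
#!/bin/sh
# Map each nstress tool to the option that limits its run time,
# as listed in the help output of each tool.
# time_limit_flag is a hypothetical helper name.
time_limit_flag() {
    case $1 in
        ncpu|nmem|nmem64|nipc) printf '%s\n' "-s" ;;
        ndisk64)               printf '%s\n' "-t" ;;
        nlog|nfile)            printf '%s\n' "-m" ;;
        *) echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

# Print a 5 minute limit for each tool.
for tool in ncpu nmem64 nipc ndisk64 nlog nfile; do
    echo "$tool $(time_limit_flag "$tool") 300"
done
```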

Manual pages = the help output of the programs

ncpu -h output
Usage: ncpu version 9.0 hammers the cpu(s)
Note: root users get a priority boost = effectively removes the CPU(s)

Hint hammer CPU mode: ncpu -p procs [-z percent] [-s secs] [-h secs] [-o "cmd"]
        -p procs   = number of copies of cpu to start (max=256)
        -z percent = Snooze percent - time spent sleeping (default 0)
        -s seconds = Seconds maximum run time (default no limit)
        -h seconds = Seconds to sleep after each second of run time
        -o "cmd" = Other command - pretend to be this other cmd when running
                Must be the last option on the line

Hints

Run in CPU counter mode:

ncpu -c

Run on eight CPUs but snooze 20% of the time so that the CPUs are not fully used, stop after 15 minutes, add jitter, and have the ps command report the processes as DB2inst:

ncpu -p 8 -z 20 -s 900 -h1 -o DB2inst
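
To size the -p value from the machine rather than hard-coding it, a script can derive the process count from the number of online CPUs. The sketch below prints an ncpu command for a given percentage of the CPUs. The helper name ncpu_for_percent is hypothetical, and it assumes that getconf supports _NPROCESSORS_ONLN on your system; pass the CPU count as the third argument if it does not.

```shell
#!/bin/sh
# Print an ncpu command that loads a given percentage of the online CPUs.
# ncpu_for_percent is a hypothetical helper; getconf _NPROCESSORS_ONLN is
# an assumption about your system and falls back to 1 if it fails.
ncpu_for_percent() {
    # usage: ncpu_for_percent <percent-of-cpus> <seconds> [cpu-count]
    np_pct=$1
    np_secs=$2
    np_cpus=${3:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
    np_procs=$(( np_cpus * np_pct / 100 ))
    [ "$np_procs" -ge 1 ] || np_procs=1        # always start at least one process
    [ "$np_procs" -le 256 ] || np_procs=256    # the ncpu help lists max=256
    echo "ncpu -p $np_procs -s $np_secs"
}

ncpu_for_percent 50 900    # half of the online CPUs for 15 minutes
```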

ndisk64 Help Output

ndisk64 version 7.4
Complex Disk tests - sequential or random read and write mixture
ndisk64 -S          Sequential Disk I/O test (file or raw device)
        -R          Random    Disk I/O test (file or raw device)

        -t <secs>   Timed duration of the test in seconds (default 5)

        -f <file>   use "File" for disk I/O (can be a file or raw device)
        -f <list>   list of filenames to use (max 16) [separators :,+]
                        example: -f f1,f2,f3  or -f /dev/rlv1:/dev/rlv2
        -F <file>   <file> contains list of filenames, one per line (upto 2047 files)
        -M <num>    Multiple processes used to generate I/O
        -s <size>   file Size, use with K, M or G (mandatory for raw device)
                        examples: -s 1024K   or   -s 256M   or   -s 4G
                        The default is 32MB
        -r <read%> Read percent min=0,max=100 (default 80 =80%read+20%write)
                        example -r 50 (-r 0 = write only, -r 100 = read only)
        -b <size>   Block size, use with K, M or G (default 4KB)
        -O <size>   first byte offset use with K, M or G (times by proc#)
        -b <list>   or use a colon separated list of block sizes (31 max sizes)
                        example -b 512:1k:2K:8k:1M:2m
        -q          flush file to disk after each write (fsync())
        -Q          flush file to disk via open() O_SYNC flag
        -i <MB>     Use shared memory for I/O MB is the size(max=256 MB)
        -v          Verbose mode = gives extra stats but slower
        -l          Logging disk I/O mode = see *.log but slower still
        -o "cmd"  Other command - pretend to be this other cmd when running
                        Must be the last option on the line
        -K num      Shared memory key (default 0xdeadbeef) allows multiple programs
                    Note: if you kill ndisk,  you may have a shared memory
                    segment left over. Use ipcs and then ipcrm to remove it.
        -p          Pure = each Sequential thread does read or write not both
        -P file     Pure with separate file for writers
        -C          Open files in Concurrent I/O mode O_CIO
        -D          Open files in Direct I/O mode O_DIRECT
        -z percent  Snooze percent - time spent sleeping (default 0)
To make a file use dd, for 8 GB: dd if=/dev/zero of=myfile bs=1M count=8196

Asynchronous I/O tests (AIO) This now uses POSIX threads and  Async I/O API
        -A         switch on Async I/O use: -S/-R, -f/-F and -r, -M, -s, -b, -C, -D to determine I/O types
                (JFS file or raw device)
        -x <min>   minimum outstanding Async I/Os (default=1, min=1 and min<max)
        -X <max>   maximum outstanding Async I/Os (default=8, max=1024)
        see above -f <file>  -s <size>   -R <read%>  -b <size>

For example:
        dd if=/dev/zero of=bigfile bs=1m count=1024
        ndisk64 -f bigfile -S -r100 -b 4096:8k:64k:1m -t 600
        ndisk64 -f bigfile -R -r75 -b 4096:8k:64k:1m -q
        ndisk64 -F filelist -R -r75 -b 4096:8k:64k:1m -M 16
        ndisk64 -F filelist -R -r75 -b 4096:8k:64k:1m -M 16 -l -v


For example:
        ndisk64 for Asynch compiled in version
        ndisk64 -A -F filelist -R -r50 -b 4096:8k:64k:1m -M 16 -x 8 -X 64
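
The help text above suggests creating test files with dd and passing a list of files with -F. The sketch below shows the arithmetic for the dd count and one way to build such a list. Note that an 8 GB file made of 1 MB blocks needs 8 x 1024 = 8192 blocks, slightly fewer than the 8196 shown in the help text. The helper names dd_count_for_gb and make_filelist and the file names f1, f2, and so on are assumptions made for this example.

```shell
#!/bin/sh
# dd_count_for_gb: number of 1 MB blocks needed for a file of <gb> GB.
dd_count_for_gb() {
    echo $(( $1 * 1024 ))
}

# make_filelist: create <n> test files of <mb> MB each in <dir> and write
# their names, one per line, to <dir>/filelist for use with ndisk64 -F.
# Both helper names are hypothetical.
make_filelist() {
    fl_dir=$1
    fl_n=$2
    fl_mb=$3
    mkdir -p "$fl_dir" || return 1
    : > "$fl_dir/filelist"
    fl_i=1
    while [ "$fl_i" -le "$fl_n" ]; do
        # bs=1024k uses the POSIX "k" suffix, so the block size is 1 MB
        dd if=/dev/zero of="$fl_dir/f$fl_i" bs=1024k count="$fl_mb" 2>/dev/null || return 1
        echo "$fl_dir/f$fl_i" >> "$fl_dir/filelist"
        fl_i=$(( fl_i + 1 ))
    done
}

dd_count_for_gb 8    # prints 8192
```

For example, `make_filelist /stress/data 4 1024` would create four 1 GB files and a list that can be passed as `-F /stress/data/filelist`; the directory /stress/data is a placeholder.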


nmem64 Help Output

$ ./nmem64
Usage: nmem version 6.1
Hint: nstress tool to hammer memory (do not use this on production machines)
 use 1: mallocs memory and then touches the memory pages at random
 use 2: cycles though memory speed test to determine/guess cache sizes (-c option)
 output includes memory size used, operations performed, time taken and ops per second

        nmem -m Mbytes [-s MaxSeconds] [-z percent][-o "cmd"]
                Mbytes     = Size of RAM to use in mega-bytes
                             For nmem the max is 255 (~256MB) and for nmem64 the max is 2047 (~2GB))
                MaxSeconds = maximum time of the test.
                        Use this to halt nmem even if you drove your OS to a stand still.
                        You have been warned!
                percent    = Snooze percent - nmem sleeps for the percentage of the time
                cmd        = nmem with pretend to be a process called cmd.
                             This option must be the last on the line in double quotes


        Memory speed test with increasing memory sizes
        - may highlight CPU cache sizes

        nmem -c [-s MaxSeconds]

                MaxSeconds = maximum time of the test (default 60)
Example:
        nmem -m 250 -s 300               - grab and randomly touch 256 MB of memory for 5 minutes
        nmem -m 250 -s 300 -z 80         - as above but slower, sleep 80% of the time
        nmem -c -s 600                   - cycle through tests (maximum of 10 minutes)
        nmem -m 6 -o "sally -x"          - Use 6 MB pretend to be process sally with parameter -x

If your OS complains about the size of memory use you hit your ulimit
         For 256MB+ try: ulimit -d unlimited
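
Before starting a large nmem or nmem64 run, a script can check whether the requested size fits under the data segment limit that ulimit -d reports (in KB). The check below is a sketch and only approximate, because the process needs some memory beyond the buffer it allocates. The helper name mem_fits is hypothetical.

```shell
#!/bin/sh
# mem_fits: succeed if <mb> MB fits under a data segment limit given in KB
# (the unit reported by "ulimit -d"), or if the limit is "unlimited".
# mem_fits is a hypothetical helper name.
mem_fits() {
    # usage: mem_fits <mb> [limit]   (limit defaults to the current ulimit -d)
    mm_mb=$1
    mm_limit=${2:-$(ulimit -d)}
    [ "$mm_limit" = "unlimited" ] && return 0
    [ $(( mm_mb * 1024 )) -le "$mm_limit" ]
}

if mem_fits 2047; then
    echo "ulimit -d allows a 2047 MB nmem64 run"
else
    echo "raise the limit first, for example: ulimit -d unlimited"
fi
```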


nlog Help Output

./nlog -h
Usage: nlog version 3.0
Hint: generates a log file at a steady rate to standard error
        nlog -k Kbytes -s Seconds -m MaxSeconds -o "cmd"
        nlog -k 1 -s 60                 is the default
Example:
        nlog -k 2 -s 1 -m 600            - 2 K per second for 10 minutes
        nlog -k 2 -s 1                   - 2 K per second forever
        nlog -k 2 -s 1 -o "bert -x"      - 2 K per second pretend to be process bert with parameter -x
        nlog -k 1 -s 60 -m 300           - 1 K per minute for 5 minutes
        nlog -k 200 -s 1 -m 3600         - 200 K per second for an hour
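
To judge how much output a run produces before starting it, the options can be multiplied out. The sketch below reads -k and -s as "this many KB every this many seconds", which is what the examples above suggest, so the total is roughly Kbytes x MaxSeconds / Seconds. The helper name nlog_total_kb is hypothetical.

```shell
#!/bin/sh
# nlog_total_kb: approximate total output in KB for
#   nlog -k <kbytes> -s <seconds> -m <maxseconds>
# nlog_total_kb is a hypothetical helper name.
nlog_total_kb() {
    # usage: nlog_total_kb <kbytes> <seconds> <maxseconds>
    echo $(( $1 * $3 / $2 ))
}

nlog_total_kb 200 1 3600   # prints 720000 (about 703 MB)
nlog_total_kb 1 60 300     # prints 5
```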


nipc Help Output

This tool runs a ring of processes that communicate through shared memory, with control passed by a UNIX IPC semaphore, and then back again through a message queue.

Usage: nipc version 2.0
nipc: hammers inter-process communication (IPC) 
    that is shared memory, semaphores and message queues

hint: nipc -p procs -s seconds

nfile Help Output

Usage: nfile version 2.0
Hint: creates and deletes files = generates JFS log file work
nfile -d directory [-k Kbytes] [-c Files] [-m MaxSeconds] [-z percent] [-o "cmd"]
        -d directory  - top level directory
        -k Kbytes     - size of the files to create in KB (default 4KB)
        -c Files      - number of files to maintain (+/- 10) (default 4096)
        -m MaxSeconds - stop of so many seconds
        -z percent    - percent of time to sleep/snooze
        -o "cmd"    - pretend to be a different cmd (must be the last option)
Example:
        nfile -d mydir -k 1 -c 10000 -m 600
        nfile -d /tmp/files -k 64 -o "bert -x"
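
Because nfile keeps a set number of files of a set size on disk, it is worth checking the free space first. The sketch below estimates the footprint as file size x (file count + 10), taking the "+/- 10" from the help text and ignoring file system metadata, and compares it with the space that df reports. The helper names nfile_footprint_kb and free_kb are hypothetical, and /tmp is used only because the example above writes to /tmp/files.

```shell
#!/bin/sh
# nfile_footprint_kb: rough space needed, in KB, for nfile -k <kb> -c <files>.
# Allows for the "+/- 10" files from the help text; metadata is ignored.
nfile_footprint_kb() {
    echo $(( $1 * ($2 + 10) ))
}

# free_kb: available space in KB for the file system holding <dir>,
# read from the POSIX output format of df (-P) in 1024-byte units (-k).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

need=$(nfile_footprint_kb 64 4096)   # 64 KB files, default count of 4096
have=$(free_kb /tmp)
if [ "$need" -le "$have" ]; then
    echo "enough space: need ${need} KB, have ${have} KB"
else
    echo "not enough space: need ${need} KB, have ${have} KB"
fi
```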

If you find errors or have questions, email me:

  • Subject: nstress
  • Email: n a g @ u k . i b m . c o m  

Document Location

Worldwide

Document Information

Modified date:
10 September 2021

UID

ibm11111065