By popular demand, I have been busy working on this in the background.
As you know, nmon is not my day job.
First is a tool that takes an nmon output file (.nmon, which is a comma-separated values text file) and converts it into JSON format. JSON is the preferred format for many "new age" Web 2.0 tools to ingest into a database and/or support dynamic graphing of the stats. JSON is a simple format, but there are options for grouping the stats.
I would like your comments on that and on which format is the best fit for you and your favourite tools.
Simple, all at one level:
. . .
Or grouped together:
. . .
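To make the two styles concrete, here is a tiny hand-written illustration - the stat names are my own invention, not the actual nmon2json output. Single level puts every stat at the top:

    { "hostname": "server1", "cpu_user": 12.3, "cpu_sys": 4.5, "mem_free_mb": 2048 }

Grouped nests related stats under a section name:

    { "hostname": "server1",
      "cpu": { "user": 12.3, "sys": 4.5 },
      "memory": { "free_mb": 2048 } }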
nmon2json: News Flash 14th Dec 2017
This was released today at the website below. There you will find the nmon2json Ksh script and sample output files in both formats (single-level and multi-level JSON), generated from both AIX and Linux nmon files.
- The single-level output option is now the default - I am told this is the way Splunk, ELK and Logstash like the data.
- Also find the Python program, an example of how to read the JSON data files and extract the stats produced by the nmon2json script.
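If you want to roll your own reader, a minimal Python sketch looks something like this - the file name and stat names are placeholders (check the sample files for the real ones), and it assumes the file holds a JSON array with one entry per nmon snapshot:

    import json

    # Load the whole file - assumed here to be a JSON array of snapshots
    with open("hostname_170101.json") as f:    # placeholder file name
        snapshots = json.load(f)

    # Print one stat per snapshot interval (placeholder stat names)
    for snap in snapshots:
        print(snap["datetime"], snap["cpu_user"])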
Limitations in the first release:
- Only does totals for Network I/O (just total KB per second and packets per second) - not individual network interfaces
- Only does totals for Disk I/O (just totals for read and write KB per second and xfers per second) - not individual disks
- Does not handle Top process stats - as it's complicated and processes are transitory, which makes parsing the JSON very hard.
- Can only be used on finished nmon files as it takes multiple passes due to the config data in the "info" section.
- This means you can't parse the live output from nmon (for example via a pipe or FIFO) for live stats collection.
- It turns out nmon for AIX can't be streamed to a pipe or FIFO - it reports the file already exists and stops.
njmon for Linux and njmon for AIX
Second is a new tool that extracts the performance stats like nmon but then generates JSON data directly. This currently covers only Linux, with lots of data from the /proc filesystem and some from system admin commands and the old UNIX compatibility library functions supported by Linux.
This is actually very quick and collects ~330 different stats so far.
Currently it is about 1,100 lines of simple C code.
As it collects the raw Linux stats, you can check the Linux documentation (Hmmm!!) or the web for explanations of the numbers.
If you have lots of CPUs, networks and disks, that number grows rapidly. My POWER8 S822LC (Supermicro "Briggs") has 20 CPU cores and SMT=8, resulting in 160 logical CPUs - each has a dozen stats, so that is roughly 1,900 stats for the CPUs alone. If collected on a PowerVM LPAR, there are a further 50 stats.
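njmon itself is C, but the core idea is easy to sketch in a few lines of Python: open a /proc file, parse the fields and print them as JSON (the output layout here is mine, not njmon's):

    import json

    # /proc/uptime holds two numbers: seconds since boot and idle seconds
    with open("/proc/uptime") as f:
        uptime, idle = (float(x) for x in f.read().split())

    print(json.dumps({"uptime_seconds": uptime, "idle_seconds": idle}))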
Limitations in the first release:
- Only Linux on any platform (currently checked on POWER8 (Ubuntu and RHEL), ARM (running Ubuntu) and AMD64, also called x86_64) and only recent releases (guessing the last 3 years) - I am not initially going to backport to or test on old Linux distro versions.
- Removed: Could do with some optimising, like caching static data and using file rewind() rather than the open/close file functions.
- Removed: Some of the /proc stats are incrementing counters (unfortunately the Linux documentation does not point these out) that are reported "as-is", typically numbers in the trillions. To work out a data rate you need to take two snapshots of the data, then take the difference and divide by the elapsed time (see the sketch after this list). This is coded up for CPU stats but not yet for networks and disks - those are made a little complex by the high numbers of possible networks and disks and by filtering out the rubbish Linux reports in /proc.
- Does not yet capture Top Processes, NFS or configuration stats like the commands lsblk and lshw output.
- Want to add two features: sending the data directly to a remote socket and port, or running as a web server so other tools can connect and pull the data.
- AIX version: Something similar could be done for AIX, as it has a far better performance stats library than Linux: libperfstat.
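Here is the snapshot-difference technique from the list above, sketched in Python against /proc/net/dev (njmon does this in C; the parsing is simplified but follows the standard layout - two header lines, then one line per interface with bytes received as the first counter):

    import time

    def read_rx_bytes():
        # Sum the received-bytes counter across all interfaces
        total = 0
        with open("/proc/net/dev") as f:
            for line in f.readlines()[2:]:          # skip the two header lines
                name, counters = line.split(":", 1)
                total += int(counters.split()[0])   # first counter = bytes received
        return total

    before = read_rx_bytes()    # first snapshot of the ever-incrementing counter
    time.sleep(2)
    after = read_rx_bytes()     # second snapshot, 2 seconds later

    # Difference divided by elapsed time gives the data rate
    print("receive rate: %.1f KB/s" % ((after - before) / 2.0 / 1024.0))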
Here is a sample output for comment: njmon_for_Linux_v11_sample.json
A call for help:
- JSON format means a much larger output file - if we had started nmon on JSON, the data volume and processing would have killed nmon (using too much CPU) and crashed the spreadsheets of the time. Is there a case for a much smaller output, like 20 key stats, or is the volume acceptable today?
- Is anyone an expert in data collectors for Splunk or ELK (Logstash)?
Calling SPLUNK data provider developers!
I am now developing a tool in C that takes all the AIX performance stats live from the perfstat library straight to JSON format. This makes it more comprehensive, and it avoids the nmon format step and the further processing that requires. It will also be "live data" rather than available only at the end of the day. Help needed:
- What JSON style is best for Splunk to accept? The Splunk guys suggested keeping it simple.
- Should it just be logged to a file, to be collected using other mechanisms? The Splunk guys suggested logging to a file that Splunk can collect - this is standard practice.
- Or get Splunk to query a REST API (if so, is there an example)? Perhaps, in the future.
- Is there a worked example of a Splunk data provider in C? Perhaps, in the future.
- Does anyone want to run a joint project? I get the data (I have 20 years' experience on that side) and a Splunk guru writes the Splunk App side (of which I know nothing).
Reading the Splunk documentation is very hard work, as it covers 1000's of options and possibilities and 100's of "use cases" and languages - like climbing a mountain where a ladder would do the job.
Update: Corresponded with Splunk staff and they say the nmon2json output will already work OK.
I already have njmon for Linux collecting and outputting JSON - it too needs to be made Splunk friendly.
Beta code is available for testers; then, once stable, we will go open source.
Cheers, Nigel Griffiths