The C code file "extra.c" is used in two ways.
In this example program, there is simple code to check that two functions (extra_init() and extra_data()) work correctly in a "stand-alone" test environment. This code is optionally compiled in, for testing only, by adding the compile option "-D EXTRA_TEST". When the code is compiled into njmon or nimon, the testing code is excluded by not having "-D EXTRA_TEST" as a compiler option.
In njmon and nimon, the new code is compiled in by adding the extra compiler option "-D EXTRA".
Here is the example code in a file called "extra.c" - using this file name is mandatory.
/* njmon / nimon -- internal data collector */
#ifdef EXTRA_TEST
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

void psection(char *s) { printf("\"%s\": {\n",s); }
void psectionend() { printf("}\n"); }
void pstring(char *name, char *s) { printf("\t\"%s\": \"%s\",\n",name,s); }
void plong(char *name, long value) { printf("\t\"%s\": %ld,\n",name,value); }
void pdouble(char *name, double value) { printf("\t\"%s\": %.3lf,\n",name,value); }

extern void extra_init();
extern void extra_data(double elapsed);

int main()
{
    extra_init();
    extra_data(2.0);
    return 0;
}
#endif /* EXTRA_TEST */

void extra_init()
{
    /* If necessary, use this function to initialise any data structures */
}

void extra_data(double elapsed)
{
    FILE * pop;
    char string[4096];
    double rperf = 0.0;

    if ( (pop = popen("/home/nag/rperf/rperf 2>/dev/null", "r") ) != NULL ) {
        if ( fgets(string, 4095, pop) != NULL) {
            /* Sample result->54.85 rPerf estimated based on 2.00 Virtual CPU cores<- */
            sscanf(string, "%lf rPerf", &rperf );
            string[strlen(string)-1] = 0; /* remove newline at the end */
        }
        pclose(pop);
    }
    if(rperf > 0.0) { /* If the command failed, dont send the data */
        psection("extra");
        pdouble("rperf",rperf);
        pstring("rperf_string",string);
        plong("meaning_of_life", 42);
        psectionend();
    }
}
Comments on the extra.c code:
- Everything between "#ifdef EXTRA_TEST" and "#endif /* EXTRA_TEST */" is the code used for stand-alone testing of your two new functions.
- These lines are compiled in by adding the -D EXTRA_TEST option to the compiler.
- In this example case, we do not need the extra_init() function so it is empty but must be present.
- The extra_data() function has one parameter, a double floating point number called "elapsed". Elapsed is the number of seconds since this function was last called (or, on the first call, since extra_init() was called). In this example case, it is not used. If your data is an incrementing measure, then the elapsed time is used to convert the statistic into a measure per second. An example would be a data rate. If the previous value was 100 KB and the current value is 120 KB, then the data rate would be "(120 - 100)/elapsed KB per second."
- This example uses the popen() function to run the Korn shell script "rperf" from the directory "/home/nag/rperf/". There is a large assumption here that every target server has the rperf command in that directory. If I were rolling out this new feature across servers, I would probably put the rperf script in /usr/lbin (AIX) or /usr/local/bin (Linux).
- The comment in the code shows the output of the script: 54.85 rPerf estimated based on 2.00 Virtual CPU cores
- The sscanf() function captures the 54.85 into the double variable rperf.
- Next, we strip off the newline character - YOU CAN'T HAVE ANY CONTROL CHARACTERS IN STRINGS in JSON or InfluxDB Line Protocol. The newline would mess up the test mode output. The full njmon or nimon program removes control characters for us.
- The pclose() function closes the pipe and waits for the command to finish. Not cleaning up would leak the open stream and the resources that go with it.
- We now have a double variable called rperf and a string variable called string (saved under the name "rperf_string") to send to the Time-Series database. In a real working case, the string is rather pointless as you cannot graph a string - only numbers can be graphed.
- Finally, check the data is good. If the extra functions fail to read the data, do not attempt to save the data. Missing data is handled well in Time-Series databases.
- In the example, we use a section name of "extra" and save a double, string and "fake" integer just as an illustration.
You can use any section name that is not already in use. Select a name that makes the data content obvious. Perhaps the example would be better named "estimated_rperf". Other examples: "rdbms", the RDBMS vendor's name, "payroll_statistics" or "app_transaction_rates".
- The function psectionend() informs njmon or nimon that the section of data called "extra" (or whatever you want to call it) has ended.
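The rate calculation described for the "elapsed" parameter can be sketched as follows. This is only an illustration: the helper names rate_per_second() and update_rate() are hypothetical and are not part of njmon or nimon, and the choice to return 0.0 for a counter that went backwards is an assumption.

```c
/* Hypothetical helper (not part of njmon/nimon): convert two readings of
 * an incrementing counter into a per-second rate.
 * Returns 0.0 if elapsed is not positive or the counter went backwards. */
double rate_per_second(double previous, double current, double elapsed)
{
    if (elapsed <= 0.0 || current < previous)
        return 0.0;
    return (current - previous) / elapsed;
}

/* The previous reading must survive between calls, so it is static.
 * A negative value means "no earlier reading yet". */
static double previous_kb = -1.0;

/* Hypothetical wrapper: returns the KB per second rate, or -1.0 on the
 * first call, when there is no earlier reading to compare against. */
double update_rate(double current_kb, double elapsed)
{
    double rate = -1.0;

    if (previous_kb >= 0.0)
        rate = rate_per_second(previous_kb, current_kb, elapsed);
    previous_kb = current_kb;   /* remember this reading for next time */
    return rate;
}
```

Inside extra_data(elapsed) you might call update_rate() with the latest counter value and only pass the result to pdouble() when it is not negative, so the first snapshot is skipped rather than reported as a bogus rate.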
Compile for Testing
Run the command:
$ cc extra.c -o extra -D EXTRA_TEST
Run the new program called "extra":
$ ./extra
"extra": {
    "rperf": 54.850,
    "rperf_string": "54.85 rPerf estimated based on 2.00 Virtual CPU cores",
    "meaning_of_life": 42,
}
$
Notes:
- This example output is badly formed JSON. The final comma (",") after the 42 is not allowed. Do not worry about this extra comma as it is due to the simplistic test code. The real njmon or nimon code strips out the comma from the output buffer - that is the prime purpose of the psectionend() function.
- If you remove that comma (and wrap the whole output in an enclosing pair of braces), you could prove it is valid JSON data by reading it into a Python program and converting it to a Python dictionary. There are other ways to test for a correct JSON format.
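To illustrate the idea of stripping the final comma from an output buffer, here is a minimal sketch. It is not the njmon or nimon source, whose implementation may well differ; the function name strip_trailing_comma() is hypothetical.

```c
#include <string.h>

/* Hypothetical helper (not the njmon source): if the last non-whitespace
 * character in buf is a comma, remove it and keep any trailing whitespace.
 * Returns 1 if a comma was removed, otherwise 0. */
int strip_trailing_comma(char *buf)
{
    size_t len = strlen(buf);

    /* Step back over trailing whitespace such as the final newline */
    while (len > 0 && (buf[len - 1] == ' '  || buf[len - 1] == '\t' ||
                       buf[len - 1] == '\n' || buf[len - 1] == '\r'))
        len--;

    if (len > 0 && buf[len - 1] == ',') {
        /* Slide the trailing whitespace (and the NUL) left over the comma */
        memmove(&buf[len - 1], &buf[len], strlen(&buf[len]) + 1);
        return 1;
    }
    return 0;
}
```

Called on the buffered section text just before the closing "}" is appended, a helper like this would turn the last entry into valid JSON.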
Compile in to njmon or nimon
The Makefile for the current AIX versions uses a command like this to compile:
cc njmon_aix_v63.c -o nimon_aix722_v63 -D NIMON -g -O3 -lperfstat -lm -qstrict
cc njmon_aix_v63.c -o njmon_aix722_v63 -D NJMON -g -O3 -lperfstat -lm -qstrict
Change to:
cc njmon_aix_v63.c -o nimon_aix722_v63 -D NIMON -g -O3 -lperfstat -lm -qstrict -D EXTRA
cc njmon_aix_v63.c -o njmon_aix722_v63 -D NJMON -g -O3 -lperfstat -lm -qstrict -D EXTRA
The extra.c file must be in the same directory as the njmon source file.
Make a similar change to your Makefile or run the commands by hand to compile your new functions into njmon and nimon.
How does your code get added to njmon and nimon?
You don't need to understand this bit but it might help you work out what is going on.
The njmon and nimon code uses the "-D EXTRA" to include your new extra.c code file and call the new functions as follows.
To include the extra.c file in the njmon or nimon code:
#ifdef EXTRA
#include "extra.c"
#endif /* EXTRA */
To call the extra_init() function before the main loop:
#ifdef EXTRA
extra_init();
#endif /* EXTRA */
To call the extra_data(elapsed) function toward the end of the main loop, so that it is the last set of statistics to be added:
#ifdef EXTRA
extra_data(elapsed);
#endif /* EXTRA */
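Putting the three fragments together, the overall shape might look like the sketch below. This is a stand-in and not the real njmon main loop: run_snapshots(), the two counters and the stub function bodies are hypothetical and exist only so the call order can be observed.

```c
#define EXTRA 1   /* stands in for -D EXTRA on the compiler command line */

/* Counters used only to observe the call order in this sketch */
static int init_calls = 0;
static int data_calls = 0;

#ifdef EXTRA
/* njmon uses #include "extra.c" at this point; stub versions are
 * written inline here to keep the sketch self-contained. */
void extra_init(void)           { init_calls++; }
void extra_data(double elapsed) { (void)elapsed; data_calls++; }
#endif /* EXTRA */

/* Hypothetical stand-in for the main loop: "count" snapshots,
 * with "seconds" used as a placeholder for the measured elapsed time. */
void run_snapshots(int count, double seconds)
{
    int i;

#ifdef EXTRA
    extra_init();               /* once, before the main loop */
#endif /* EXTRA */

    for (i = 0; i < count; i++) {
        /* ... the built-in statistics would be collected here ... */
#ifdef EXTRA
        extra_data(seconds);    /* last statistics added in each snapshot */
#endif /* EXTRA */
    }
}
```

Without the EXTRA definition, all three guarded fragments disappear at compile time and the loop behaves exactly as before.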
Testing the new code worked
Because njmon outputs each JSON data record on a single line, the files are hard to read and edit (unless you use the line2pretty.py Python code to convert the format). It is simpler to use nimon for testing, as its output file is easy to edit. Warning: the -f option creates the output file in your current working directory, with a file name that ends with ".influxlp":
./nimon_aix722_v63 -s1 -c1 -f
Wait five seconds.
Then, check the end of the new ".influxlp" file (in my case the filename is "blue_20200511_2236.influxlp") as follows:
$ tail -1 blue_20200511_2236.influxlp
extra,host=blue,os=AIX,architecture=POWER8_COMPAT_mode,serial_no=7804930,mtm=IBM-9009-42A rperf=54.850,rperf_string="54.85 rPerf estimated based on 2.00 Virtual CPU cores",meaning_of_life=42i
$
Note:
- The last line starts with "extra" or whatever name you used in psection()
- Next are the tags, like "host=blue". You can ignore the tags here; they make your statistics easier to find in the Time-Series database.
- After the space character is the actual data:
rperf=54.850,rperf_string="54.85 rPerf estimated based on 2.00 Virtual CPU cores",meaning_of_life=42i
- The trailing "i" is there because the value is an integer.
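The layout of that line (measurement name, comma-separated tags, a space, then the fields) can be reproduced with a small formatting sketch. This is not how nimon builds its output; format_extra_line() is a hypothetical helper and only the "host" tag is included to keep it short.

```c
#include <stdio.h>

/* Hypothetical helper (not the nimon source): build one InfluxDB Line
 * Protocol line for the "extra" measurement.
 * - floating point fields are written as plain numbers
 * - string fields are wrapped in double quotes
 * - integer fields get a trailing "i"
 * Returns the value returned by snprintf(). */
int format_extra_line(char *out, size_t size, const char *host,
                      double rperf, const char *rperf_string, long life)
{
    return snprintf(out, size,
                    "extra,host=%s rperf=%.3f,rperf_string=\"%s\",meaning_of_life=%ldi",
                    host, rperf, rperf_string, life);
}
```

For example, calling it with "blue", 54.85, "54.85 rPerf" and 42 produces a line with the same structure as the tail output above, with fewer tags.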
If you have similar results, then your code is working. Well done.
The njmon and nimon programs are built from the same source code; they differ only in outputting JSON or InfluxDB Line Protocol.
If nimon works correctly, then so does njmon.
Now run the new njmon or nimon for 10 quick snapshots (-c 10 -s 10) so that the data gets to InfluxDB.
Checking your new statistic arrives in the database
After two or three snapshot periods, on the InfluxDB server, run the command-line "influx" program and type the following commands.
This assumes your InfluxDB database is called "njmon" and the psection() name was "extra".
# influx
Connected to http://localhost:8086 version 1.7.7
InfluxDB shell version: 1.7.7
> use njmon
Using database njmon
> select * from extra
name: extra
time                architecture       host meaning_of_life mtm          os  rperf rperf_string                                          serial_no
----                ------------       ---- --------------- ---          --  ----- ------------                                          ---------
1589225676653374913 POWER8_COMPAT_mode blue 42              IBM-9009-42A AIX 54.85 54.85 rPerf estimated based on 2.00 Virtual CPU cores 7804930
1589225687042766744 POWER8_COMPAT_mode blue 42              IBM-9009-42A AIX 54.85 54.85 rPerf estimated based on 2.00 Virtual CPU cores 7804930
Alternatively, directly select your new statistic columns:
> select rperf,meaning_of_life,rperf_string from extra
name: extra
time                rperf meaning_of_life rperf_string
----                ----- --------------- ------------
1589225676653374913 54.85 42              54.85 rPerf estimated based on 2.00 Virtual CPU cores
1589225687042766744 54.85 42              54.85 rPerf estimated based on 2.00 Virtual CPU cores
1589225697429213338 54.85 42              54.85 rPerf estimated based on 2.00 Virtual CPU cores
1589225707969159256 54.85 42              54.85 rPerf estimated based on 2.00 Virtual CPU cores
1589225717355288209 54.85 42              54.85 rPerf estimated based on 2.00 Virtual CPU cores
If your new statistics are in InfluxDB, well done.
Now implement your new njmon or nimon into production
If you rely on a program or script to get the statistics, then make sure every server has that program or script in the same directory.
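One way to guard against a server where the script is missing is to test for it before calling popen(). The sketch below uses the POSIX access() call; the helper name script_available() is hypothetical, and the example code already copes with a missing script by simply sending no data, so this check is optional.

```c
#include <unistd.h>

/* Hypothetical helper (not part of njmon/nimon): returns 1 when the file
 * at path exists and is executable by the current user, otherwise 0. */
int script_available(const char *path)
{
    return access(path, X_OK) == 0;
}
```

You might call script_available("/home/nag/rperf/rperf") once in extra_init(), remember the answer in a static variable, and skip the popen() in extra_data() when the script is not there.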
Now run your new njmon or nimon for real. Wait an hour (so that a graph can be drawn) and use Grafana to graph your new statistics.
In this worked example, the rperf number does not change unless the LPAR (VM) size is dynamically changed.
Here is an example from my server: the estimated rPerf changes as the Virtual Processor count for the LPAR (VM) is dynamically changed via the HMC.
- The upper left shows the current value in a "Single Stat" panel.
- The middle graph shows the rPerf values.
- The lower graph shows the Virtual Processors and the CPU consumed by real work.
The last thing to do is email me to let me know you have it working and which statistics you added. I might add any statistics that are generally useful to all njmon and nimon users in the next release. Your hard work gets acknowledged in the release notes.