If you configure Cloud Pak for Data to send user activity logs to an RSYSLOG server, you can set up a simple Python script on the RSYSLOG server to add descriptions to the logs.
Before you begin
Your RSYSLOG server must be configured to write the logs to a file.
About this task
This topic includes a sample script that you can adapt to your needs.
The script calls the cpd-cli manage gateway-context-array command. The command generates a file that contains a list of descriptions that the Python script can map to each route in the log.
If you want to use the script as-is, ensure that the cpd-cli is installed on the RSYSLOG server. You can either modify the script to call the cpd-cli from the directory where it is installed or make the cpd-cli executable from any directory. For more information, see Installing the IBM Cloud Pak for Data command-line interface.
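The exact output of the gateway-context-array command can vary by release. Based on the fields that the sample script reads (regex and details), each rule is assumed to pair a route regex with a [component, description] list. The following sketch, with hypothetical routes and details, shows how a rule maps to a route:

```python
import json
import re

# Hypothetical rule array in the shape the sample script expects:
# each rule pairs a route regex with a [component, description] list.
rules = json.loads("""
[
  {"regex": "^/zen-login/.*", "details": ["zen", "Zen login page asset"]},
  {"regex": "^/api/v1/users.*", "details": ["usermgmt", "User management API"]}
]
""")

# Map a route to the first rule whose regex matches it
route = "/zen-login/logo.png"
for rule in rules:
    if re.match(rule["regex"], route):
        print(rule["details"][0] + ": " + rule["details"][1])
        break
```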
Procedure
- Create a Python script with the following contents:
  import json
  import re
  import subprocess
  import sys

  if len(sys.argv) < 3:
      print("Provide the input and output filenames")
      exit()

  fileinput = sys.argv[1]
  fileoutput = sys.argv[2]

  # get the JSON rule list from the cpd-cli command
  f = subprocess.run(['cpd-cli', 'manage', 'gateway-context-array'], stdout=subprocess.PIPE).stdout.decode('utf-8')

  # extract the JSON part of the output
  fstart = re.search(r'\[\r\n+(.*)', f)
  fend = re.search(r'\r\n\]', f)
  filedata = f[fstart.start():fend.end()]

  # parse the JSON array of rules
  ruledata = json.loads(filedata)

  # validate the regex rules and keep only the ones that compile
  validrules = []
  for rule in ruledata:
      try:
          re.compile(rule['regex'])
          validrules.append(rule)
      except re.error as e:
          print('rule ' + rule['regex'] + ' is invalid because')
          print(e)
  ruledata = validrules

  with open(fileinput, 'r') as finput:
      with open(fileoutput, 'w') as foutput:
          for line_num, line in enumerate(finput):
              if 'cpd_nginx' in line:
                  # check whether the line contains a JSON payload
                  json_matched = re.search(r'[{]+(.*)+[}]', line)
                  if json_matched:
                      # convert the matched string to a JSON object
                      line_obj = json.loads(json_matched.group(0))
                      route = line_obj['request'].split()[1]
                      # keep the prefix of the line as the log_from tag
                      start_pos = json_matched.start()
                      line_obj['log_from'] = line[0:start_pos - 2]
                      # find the first rule that matches the route
                      matchedrule = None
                      for rule in ruledata:
                          matchedrule = re.match(rule['regex'], route)
                          if matchedrule:
                              break
                      if matchedrule:
                          line_obj['component'] = rule['details'][0]
                          line_obj['description'] = rule['details'][1]
                      else:
                          line_obj['details'] = "NOT MATCHED"
                      foutput.write(json.dumps(line_obj) + "\n")
              else:
                  print(line_num, "not found http_request in", line)

  # verify that each line in the output file is valid JSON
  with open(fileoutput, "r") as fcheck:
      for line_num, line in enumerate(fcheck):
          line_obj = json.loads(line)
          print(line_obj)
- When you call the script, specify the name of the log file that you want to add the mappings to and the name that you want to use for the output. For example:
python <my-script-file>.py <fully-qualified-input-file-name> <fully-qualified-output-file-name>
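The extraction step that the script performs on each log line can be tried in isolation. The following sketch uses a shortened line modeled on the entry in the Results section (the payload fields are trimmed for brevity):

```python
import json
import re

# A shortened rsyslog line modeled on the entry in the Results section
line = ('Sep  1 16:36:36 ibm-nginx-c57fbfc98-lhskk cpd_nginx: '
        '{"request": "GET /zen-login/logo.png HTTP/1.1", "remote_addr": "192.0.2.255"}')

# Locate the JSON payload, parse it, and pull the route out of the request field
json_matched = re.search(r'[{]+(.*)+[}]', line)
line_obj = json.loads(json_matched.group(0))
route = line_obj['request'].split()[1]
print(route)  # /zen-login/logo.png
```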
Results
The script appends a component name and a description to each entry in the user activity log. For example, in the following log entry, the script added the entries that are tagged with 1 and 2.
{
'http_referrer': 'https://my-cpd-instance.com/auth/login/sso?logged_out=true',
'request': 'GET /zen-login/logo.png HTTP/1.1',
'http_sec-fetch-dest': 'image',
'http_x_forwarded_for': '192.0.2.0',
'time': '2023-09-01T16:36:36+00:00',
'remote_addr': '192.0.2.255',
'upstream_addr': '',
'log_from': 'Sep 1 16:36:36 ibm-nginx-c57fbfc98-lhskk cpd_nginx',
1 'component': 'zen',
2 'description': 'Zen login page image'
}
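Because the script writes one JSON object per line, the output file is easy to post-process. As a hypothetical follow-up, this sketch pulls out the requests that no rule matched (using in-memory sample lines in place of the output file):

```python
import json

# Sample output lines in the shape that the script writes (one JSON object per line)
lines = [
    '{"request": "GET /zen-login/logo.png HTTP/1.1", "component": "zen", "description": "Zen login page image"}',
    '{"request": "GET /unknown/route HTTP/1.1", "details": "NOT MATCHED"}',
]

# Collect the requests that were tagged NOT MATCHED
unmatched = [entry['request'] for entry in map(json.loads, lines)
             if entry.get('details') == 'NOT MATCHED']
print(unmatched)  # ['GET /unknown/route HTTP/1.1']
```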