IBM Support

"Started At" field changes based on time frame selected even if Process ID remains the same

Troubleshooting


Problem

A strange scenario is noticed.
It can be a "working as designed" issue but still it is at least misleading.
There is a process that is started on Jan 9th:
[usr@hostX ~]$ TZ=Asia/Tokyo; ps -eo lstart,pid,args
STARTED                    PID       COMMAND ....
Mon Jan 9 17:33:50 2023    18611     processX -port 30007

Basically the problem is that the "Started At" time is changing based on the time frame chosen in the time picker.

As an example you can select the last 7 days for the processX on hostX and you get
Started At 2023-06-12, 00:48:04 (4d 13h 59m)   
but if you select a time frame from Jan 1st 2023 to Feb 1st 2023 then you get the correct
Started At 2023-01-09, 09:33:50 (5mo 7d 5h 14m)
(Instana UI adjusted to local time zone)

From the Instana UI, the snapshot ID is always the same and the process ID is the same too.
What's the problem here?

Cause

There's a difference in what the ps command outputs and what the agent reads as "Started At" on certain systems.

Resolving The Problem

Below you can find the result of the investigation.

The ps command gets the information about a process by looking into the /proc/[pid]/stat file (this is a virtual file system, it's not "real").

The file contains many information about a process, but we're interested in the field starttime that is in position 22, and it is defined as "The time the process started after system boot". You can check the man page

What our agent does, is a bit different:

  • If /proc path is available (and it is on Unix based systems), the agent tries to read metadata information of the /proc/[pid] directory. Then, it stores the information as a UnixAsBasicFileAttributes object, which then has access to creationTime, lastModifiedTime, and lastAccessTime fields. We can mimic this by doing stat /proc/[pid].
  • If creationTime or “birth time” is not supported, it returns lastModifiedTime instead.
  • birth time is only available on certain file systems and they all have a different name for it, examples: XFS v5 → crtime, ufs2 → st_birthtime, zfs → crtime, ext4 → crtime, btrfs → otime, jfs → di_otime
  • On RHEL birth time is not available, if we do stat /proc/[pid] (or stat -c ‘%W’ /proc/[pid]) we get 0 (or unknown), and in this case the agent will return the lastModifiedTime.
  • In this case (when /proc path is available) SIGAR is never actually used, it’s all done by the agent.
     

So what we suspect we're seeing here as "Started At" is the lastModifiedTime of that process directory /proc/18611 (because creationTime (birth time) is not available), which differs from the output of ps -eo lstart,pid,args.

Something happened to the process that caused it to change its modified time; for example, events that change the lastModifiedTime include adding, removing, or renaming files or subdirectories in /proc/18611. Changes to this dir are done by the kernel and we can't say what caused them.


To confirm all of this, you can run
stat /proc/18611
and compare the Modify field to our "Started At 2023-06-12, 00:48:04 (4d 13h 59m)"  and verify they are the same.

The output should look something like this:

Access: 2023-07-20 08:43:17.709922831 +0000 Modify: 2023-07-20 08:43:17.709922831 +0000 Change: 2023-07-20 08:43:17.709922831 +0000 Birth: -

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB45","label":"Automation"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSE1JP5","label":"IBM Instana Observability"},"ARM Category":[{"code":"a8m3p000000UoUGAA0","label":"Agent"}],"ARM Case Number":"TS013172527","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Product Synonym

Instana

Document Information

Modified date:
04 August 2023

UID

ibm17020393