IBM Support

Diagnosing Cron and At Command Problems

Question & Answer


Question

I'm having a problem with cron or at jobs. How can I find out what's wrong?

Answer

First Questions to Ask

1. Is only a single user having a problem, or is this common across all users?

2. Does it work for the root user but not for normal users?

3. Is there a specific entry in the user's crontab file that is failing?

4. Did this work before? If so, what has changed?

5. Is this only a problem at certain times of the day or year (for example, around a DST change)?


Items to check when diagnosing a cron or at problem

1. Check that cron has the correct permissions and ownership. Note that cron is a SETUID root binary:
# ls -l /usr/sbin/cron
-r-s--S---    1 root     cron          77152 Jul 03 12:10 /usr/sbin/cron
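
The same check can be scripted so that it is easy to repeat. The following is a minimal sketch, not part of any AIX tooling: the function name check_setuid_root is made up for illustration, and it only tests the owner and the setuid bit, not the group or the remaining mode bits.

```shell
# Report whether a file is owned by root and has the setuid bit set.
# Prints "ok" or a short reason. Uses only test(1), ls(1) and awk(1).
check_setuid_root() {
    file=$1
    [ -f "$file" ] || { echo "missing"; return 1; }
    [ -u "$file" ] || { echo "no setuid bit"; return 1; }
    # Third column of "ls -ld" is the owner name.
    owner=$(ls -ld "$file" | awk '{ print $3 }')
    [ "$owner" = "root" ] || { echo "owner is $owner, not root"; return 1; }
    echo "ok"
}
```

For example, check_setuid_root /usr/sbin/cron should print "ok" on a system where the listing looks like the one above.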

2. Check that cron is still running using ps -ef@:
# ps -ef@ | grep cron | grep -v grep | grep Global
  Global     root  3539064        1   0   Jul 20      -  0:48 /usr/sbin/cron

3. If the cron daemon is not running, check that cron has a "respawn" entry in the /etc/inittab file:
cron:23456789:respawn:/usr/sbin/cron
** Note: If cron is not running, but attempts to start /usr/sbin/cron fail with a "cron is already running" message, check for a stale lock file (/etc/locks is a link to /var/locks):
# ls -al /etc/locks/LCK..cron
      If this file exists, remove it and have init re-read the inittab:
# rm /etc/locks/LCK..cron
# telinit q
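
The inittab check in this step can be sketched as a small function. This is an illustration only: the name cron_inittab_action is hypothetical, and it assumes the usual id:runlevels:action:command layout, in which an entry commented out with a leading ":" has an empty first field. On AIX, "lsitab cron" prints the same entry directly.

```shell
# Print the action field (for example "respawn") of the active cron
# entry in an inittab-format file. Returns non-zero if no active
# entry is found. Defaults to /etc/inittab when no file is given.
cron_inittab_action() {
    inittab=${1:-/etc/inittab}
    awk -F: '
        $1 == "cron" { print $3; found = 1; exit }
        END { if (!found) exit 1 }
    ' "$inittab"
}
```

If the function prints anything other than "respawn", or prints nothing, init will not restart cron when it exits.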

4. If a specific user or set of users cannot use crontab or cron, check that the 'daemon' attribute is set to "true" for them:
# lsuser -a daemon mike
mike daemon=false

In this instance a user will see this error when trying to use crontab:
$ crontab -e
crontab: you are not authorized to use cron.  Sorry.

Changing the user's 'daemon' attribute will fix the problem:
# chuser daemon=true mike
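
When several users are affected, the lsuser output can be filtered for every account that has the attribute turned off. The following is a minimal sketch; users_denied_cron is a hypothetical helper name, and it assumes the "name attribute=value" output format shown above.

```shell
# Read "lsuser -a daemon ..." output on stdin and print the name of
# each user whose daemon attribute is false.
users_denied_cron() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i == "daemon=false") print $1
    }'
}
```

A typical invocation would be "lsuser -a daemon ALL | users_denied_cron", run as root.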

5. Collect a zsnap, which includes cron information. If you cannot get a zsnap, then at least gather these files or output:

In the /var/adm/cron directory
*.allow files (if present)
*.deny files (if present)
log (cron log)
queuedefs

/etc/cronlog.conf
/var/spool/cron/crontabs/user (specific user's crontab)
/var/spool/mail/user (specific user's mailbox)

The download link for the zsnap tool is at the bottom of this technote in the References section. To gather the cron and user information we suggest using:
# zsnap --CRON --SECURITY --pmr pmr.br.co
where pmr.br.co is the full PMR number for the problem.
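
If zsnap is not available, the files listed above can be copied by hand. The sketch below is one possible way to do that and is not a replacement for zsnap. The function name collect_cron_files and its optional third argument (a directory prefix, useful for trying the function against a copy of the tree) are assumptions made for this example.

```shell
# Copy the cron-related files for one user into a destination
# directory, keeping their full paths so that the crontab and the
# mailbox (which share a file name) do not overwrite each other.
# Usage: collect_cron_files user destdir [rootprefix]
collect_cron_files() {
    user=$1
    dest=$2
    root=${3:-}
    for f in \
        "$root"/var/adm/cron/*.allow \
        "$root"/var/adm/cron/*.deny \
        "$root/var/adm/cron/log" \
        "$root/var/adm/cron/queuedefs" \
        "$root/etc/cronlog.conf" \
        "$root/var/spool/cron/crontabs/$user" \
        "$root/var/spool/mail/$user"
    do
        # Skip files that are not present (including unmatched globs).
        [ -f "$f" ] || continue
        rel=${f#"$root"}
        mkdir -p "$dest$(dirname "$rel")" && cp -p "$f" "$dest$rel"
    done
}
```

For example, "collect_cron_files mike /tmp/cron.mike" run as root gathers the files for user mike under /tmp/cron.mike.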

6. Check the user's mailbox.
Any cron job that writes to STDOUT or STDERR causes cron to mail that output to the user who owns the job.

   Date: Mon, 4 Oct 2010 13:15:05 -0700
   From: root
   To: you
   Subject: Output from cron job date, user@hostname, exit status 0

   Cron Environment:
    SHELL = /usr/bin/sh
    PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin:/usr/java5/jre/bin:/usr/java5/bin
    CRONDIR=/var/spool/cron/crontabs
    ATDIR=/var/spool/cron/atjobs
    LOGNAME=user
    HOME=/home/user

   Your "cron" job executed on hostname on Mon Oct  4 13:15:00 PDT 2010

   date

   produced the following output:

   Mon Oct  4 13:15:04 PDT 2010

   *****************************************************************
           cron: The previous message is the standard output
           and standard error of one of the cron commands.

A few things to note here:
1. The exit status of 0 means the job completed with no errors.
2. The time and date the job was executed.
3. The command that was run.
4. The environment set up for the job.
5. The actual output of the job.

If cron starts a job but the job itself fails, then cron is most likely working fine, and the customer should investigate why that job did not run in the environment cron has set up for it.
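
One way to investigate is to run the command by hand with an environment close to the one reported in the mail message. The following is a rough sketch, not an exact reproduction of what cron does: run_like_cron, CRON_SHELL and CRON_PATH are hypothetical names, and the defaults are taken from the sample message above (without the Java directories).

```shell
# Run a command line with a stripped-down environment similar to the
# "Cron Environment" section of the mail message. CRON_SHELL and
# CRON_PATH may be set beforehand to override the defaults.
run_like_cron() {
    cron_shell=${CRON_SHELL:-/usr/bin/sh}
    cron_path=${CRON_PATH:-/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin}
    # env -i starts from an empty environment, so variables from the
    # login session (for example a longer PATH) are not inherited.
    env -i \
        SHELL="$cron_shell" \
        PATH="$cron_path" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        HOME="$HOME" \
        "$cron_shell" -c "$1"
}
```

For example, run_like_cron '/home/user/myjob.sh' (a placeholder path) often reproduces failures caused by a command that is found through the login PATH but not through cron's PATH.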

7. Compare this email output with the cron log file to see what it says about the job.
user   : CMD ( date ) : PID ( 426004 ) : Mon Oct  4 13:15:00 2010
Cron Job with pid: 426004 Successful

This gives us the following information:
1. The user the job was run as.
2. The PID that was forked off of cron.
3. The command run with arguments.
4. The date and time it was run.
5. Whether or not the job started successfully.
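
On a busy system the start line and the status line for a job may be far apart in the log. The sketch below pairs them by PID and prints the start line of every job whose status is not "Successful". It assumes the two line formats shown above, and it assumes that an unsuccessful job is reported on the same kind of line with a different final word; cron_log_failures is a hypothetical name.

```shell
# Read a cron log (file arguments or stdin) and print the CMD line of
# each job whose "Cron Job with pid:" line does not end in "Successful".
cron_log_failures() {
    awk '
        /: CMD \(/ {
            # Remember the start line, keyed by the PID it reports.
            if (match($0, /PID \( [0-9]+ \)/)) {
                pid = substr($0, RSTART + 6, RLENGTH - 8)
                start[pid] = $0
            }
            next
        }
        /^Cron Job with pid:/ {
            # Fields: Cron Job with pid: <pid> <status>
            if ($6 != "Successful" && ($5 in start))
                print start[$5]
        }
    ' "$@"
}
```

A typical invocation would be "cron_log_failures /var/adm/cron/log".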

If the cron log shows that the job started successfully, but the crontab entry redirects STDOUT and STDERR to /dev/null, have the customer change that crontab entry.

From:
   15 13 * * * date > /dev/null 2>&1
to:
   15 13 * * * date

That way STDOUT and STDERR will be emailed to them as seen above.

8. If cron is hung, try to get a stack trace of it.
On AIX 5.3 and later you can use /usr/bin/procstack to get a quick view of the current state of the process:
# ps -ef@ | grep cron | grep -v grep | grep Global
    Global   root 258184      1   0   Oct 01      -  0:00 /usr/sbin/cron

# procstack 258184
258184: /usr/sbin/cron
0xd0383a34  read(??, ??, ??) + 0x1a8
0x10000c74  msg_wait() + 0xe8
0x10004240  idle(??, ??) + 0x4c
0x10004f50  main(??, ??) + 0x544
0x10000198  __start() + 0x98

9. Check to see if cron has forked off a child process that may be hung. This can easily be done with the /usr/bin/proctree command:
# proctree -a 258184
1    /etc/init
   258184    /usr/sbin/cron

Adding the "-a" option lets you see that init started cron from the inittab; cron is still listed as a child process of init.

NOTE: Both /usr/bin/procstack and /usr/bin/proctree are found in the bos.perf.proctools fileset.


Further Steps for At Command Issues

For the "at" command, which is also run via cron, more information can be gathered.

1. Was the at job scheduled or did it fail?
If it failed to schedule, the user should have seen an error on the command line similar to:
   at: 0481-098 The specified date is not in the correct format.

Remember that the /usr/bin/at command reads the command to be run from STDIN. This syntax is incorrect:
$ at now +1 minute command

But this will work:
$ echo "command" | at now + 1 minute

A correct job scheduling should come back to the user with a message similar to:
Job user.1286233998.a will be run at Mon Oct  4 16:13:18 PDT 2010.


2. If the time has not come for the job to run, check the at queue using atq:
$ atq
user.1286233998.a    Mon Oct  4 16:13:18 PDT 2010

3. You can also check the /var/spool/cron/atjobs directory for any jobs awaiting their run time:
# ls
user.1286234667.a
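
The sample job names appear to follow the pattern user.time.queue, where the number is the scheduled run time in seconds since the epoch (1286233998 corresponds to 16:13:18 PDT on Oct 4, 2010, which matches the scheduling message shown earlier) and the trailing letter is the queue. Treat that layout as an inference from the samples. Under that assumption, a job file name can be split into its parts as sketched below; at_job_fields is a hypothetical helper name.

```shell
# Split an at job file name of the form user.time.queue and print the
# three parts separated by spaces. The name is split from the right,
# so a user name that itself contains dots is kept intact.
at_job_fields() {
    name=$1
    queue=${name##*.}     # text after the last dot
    rest=${name%.*}       # everything before the last dot
    epoch=${rest##*.}     # text after the remaining last dot
    user=${rest%.*}       # what is left is the user name
    printf '%s %s %s\n' "$user" "$epoch" "$queue"
}
```

For example, at_job_fields user.1286234667.a prints "user 1286234667 a".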

The job itself should include the environment for the job, plus the command to run:

# cat user.1286234667.a
REAL_USER=userLOGIN_USER=userREAL_GROUP=staffGROUPS=staff,SUADMINAUDIT_CLASES=general,tcpipRLIMIT_CPU=9223372036854775807RLIMIT_FSIZE=2097151RLIMIT_DATA=262144RLIMIT_STACK=65536RLIMIT_CORE=2097151RLIMIT_RSS=65536RLIMIT_NOFILE=2000RLIMIT_THREADS=9223372036854775807RLIMIT_NPROC=9223372036854775807RLIMIT_CPU_HARD=9223372036854775807RLIMIT_FSIZE_HARD=2097151RLIMIT_DATA_HARD=18014398509481984RLIMIT_STACK_HARD=8388608RLIMIT_CORE_HARD=18014398509481984RLIMIT_RSS_HARD=18014398509481984RLIMIT_NOFILE_HARD=9223372036854775807RLIMIT_THREADS_HARD=9223372036854775807RLIMIT_NPROC_HARD=9223372036854775807UMASK=22PAG_DATA=USRENVIRON:_=/usr/bin/atLANG=en_USLOGIN=userPATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:/usr/bin/X11:/sbin:/usr/java5/jre/bin:/usr/java5/binLC__FASTMSG=trueLOGNAME=userMAIL=/usr/spool/mail/userLOCPATH=/usr/lib/nls/locUSER=userAUTHSTATE=compatSHELL=/usr/bin/kshODMDIR=/etc/objreposHOME=/home/userTERM=xtermMAILMSG=[YOU HAVE NEW MAIL]PWD=/home/userTZ=America/Los_AngelesA__z=!LOGNAMESYSENVIRON:LOGNAME=userNAME=userTTY=/dev/pts/3
umask 022
cd /home/user
oslevel

4. As with cron jobs, the user should receive an email with any STDOUT or STDERR output from the command.

5. Check the permissions and ownership of the at command. Like cron, it is a setuid root binary:
# ls -l /usr/bin/at
-r-sr-sr-x    1 root     cron          56566 Jul 03 12:13 /usr/bin/at


References

The zsnap tool can be downloaded from the IBM Support web site:
http://www-01.ibm.com/support/docview.wss?uid=aixtools-2f5a8cf3


Document Information

Modified date:
07 March 2019

UID

isg3T1012492