Troubleshooting
Problem
The machine hangs during boot, and there is no console login. A remote login via ssh or telnet is possible.
Symptom
The system does not boot to a console login prompt.
Cause
This problem is usually a result of init hanging while waiting for a process to complete.
The init process reads /etc/inittab from top to bottom. When init encounters an entry with the action "once", as seen in the following inittab line, it executes the script and does not wait for the process to complete.
mkatmpvc:2:once:/usr/sbin/mkatmpvc >/dev/console 2>&1
If an entry is set to "wait", as seen in the following inittab line, init executes it and does not continue until the process has finished.
install_assist:2:wait:/usr/sbin/install_assist </dev/console >/dev/console 2>&1
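If an entry is set to "respawn", init does not start or respawn that process until the entries it is still waiting on have completed. This is why a "wait" entry that never finishes can keep getty from starting.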
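Because a "wait" entry that never finishes blocks what comes after it, it can help to list the active "wait" entries in the order init reads them. The following is a minimal sketch, not an IBM-supplied tool. The function name list_wait_entries is made up for illustration, and it assumes the usual colon-separated layout Identifier:RunLevel:Action:Command, with commented lines starting with a colon.

```shell
#!/bin/sh
# list_wait_entries is a hypothetical helper, not part of AIX.
# It prints the line number, identifier, and command of each active "wait" entry.
# Lines that start with ":" (comments) have an empty first field and are skipped.
list_wait_entries() {
    awk -F: '
        $1 != "" && $3 == "wait" {
            cmd = $4
            # Rejoin the command if it contains colons of its own.
            for (i = 5; i <= NF; i++) cmd = cmd ":" $i
            print NR ": " $1 ": " cmd
        }' "${1:-/etc/inittab}"
}
```

Run it as list_wait_entries /etc/inittab. Any entry it prints is a candidate for the hang if its command never exits.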
Resolving The Problem
Determine what is preventing getty from starting:
1) First, look at init in its normal state, where it waits in "et_wait".
#kdb
(0)> f 1
pvthread+000100 STACK:
[000540DC]et_wait+0002B0 (00000000D0128060, 200000000000D0B2,
00000000FFFFFFFF [??])
[004B024C]pause+000038 ()
[00003810].svc_instr+000110 ()
[D012805C]_p_pause+00007C ()
[10000C14]main+000884 (??, ??)
[10000198]__start+000098 ()
We can create an init hang by using a simple sleep script.
#vi /tmp/sleep.scr
#!/bin/ksh
echo "Getty will not start now."
sleep 3000
#vi /etc/inittab
Add the following line to inittab to simulate a hang.
hang:23456789:wait:/tmp/sleep.scr >/dev/console 2>&1
A) On a 64-bit system burritobso
===================================
burritobso:/:# bootinfo -K
64
burritobso:/:#cat /etc/inittab |grep sleep.scr
hang:23456789:wait:/tmp/sleep.scr >/dev/console 2>&1
I added this line to /etc/inittab on purpose and rebooted the system.
This should simulate what customers usually see when the getty process does not start.
I do not have a login prompt on the console. Let's see if we can find out what is causing the hang.
burritobso:/:#kdb
(0)> f 1
pvthread+000100 STACK:
[0013E3D8]_kwaitpid+00038C (00000000D03B6324, 200000000000D0B2,
00000000FFFFFFFF, 000000002FF218F0, 0000000000001003, 00000000510000F0,
0000000000000001 [??])
[0013F18C]kwaitpid+00007C (??, ??, ??, ??, ??)
[00003810].svc_instr+000110 ()
[D03B6320]waitpid+00016C (??, ??, ??)
[10002D74]waitproc+0000D4 (??)
[10002374]spawn+00017C ()
[10000ADC]main+00074C (??, ??)
[10000198]__start+000098 ()
We can see init stuck in _kwaitpid.
(0)> dc _kwaitpid 80|grep r4
._kwaitpid+0000A0 extsw r28,r4
._kwaitpid+0000B4 li r4,0
._kwaitpid+0000E0 srdi r4,r5,C4
._kwaitpid+0000E4 or r3,r3,r4
._kwaitpid+0000E8 clrlwi r4,r5,1F
._kwaitpid+0000EC or r3,r3,r4
._kwaitpid+000120 lis r4,1
The extsw line shows that the value in register r4 is copied (sign-extended) into r28. Now, we must read register r28 from the saved machine state to get the PID.
(0)> mst 1
Machine State Save Area [F00000002FF47600]
iar : 000000000013E3D8 msr : 80000000000010B2 cr : 24424224
lr : 0000000000000000 ctr : 0000000000000000 xer : 20000000
mq : FFFFFFFF asr : FFFFFFFFFFFFFFFF
r0 : 0000000000248000 r1 : F00000002FF471B0 r2 : 000000000153F278
r3 : 00000000D03B6324 r4 : 200000000000D0B2 r5 : 00000000FFFFFFFF
r6 : 000000002FF218F0 r7 : 0000000000001003 r8 : 00000000510000F0
r9 : 0000000000000001 r10 : F00000002FF47600 r11 : 0000000000000000
r12 : 00000000002D94E4 r13 : F1000100101A0400 r14 : F100070F00008400
r15 : 000000000003F0D2 r16 : F100010017EEF400 r17 : 00000000510000F0
r18 : 0000000000000000 r19 : 0000000000000001 r20 : 0000000000000000
r21 : FFFFFFFFFFFC0F2E r22 : F00000002FF47600 r23 : 0000000000000004
r24 : F100070F00000000 r25 : 0000000000000000 r26 : F100070F00000400
r27 : F1000100101A0400 r28 : 000000000003F0D2 r29 : F10001001019D800
r30 : F100070F10000100 r31 : 0000000000000004
This shows us register r28 with a hexadecimal value of 000000000003F0D2.
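If the kdb output has been saved to a text file (for example, by copying it from the terminal), the register value can be picked out with a small filter instead of by eye. This is only a sketch: reg_value is a hypothetical helper name, and it assumes the "rNN : VALUE" layout shown in the mst output above.

```shell
#!/bin/sh
# reg_value is a hypothetical helper, not a kdb feature.
# It reads saved register output on stdin and prints the value
# that follows "<register> :" for the first match it finds.
reg_value() {
    awk -v reg="$1" '{
        for (i = 1; i + 2 <= NF; i++)
            if ($i == reg && $(i + 1) == ":") { print $(i + 2); exit }
    }'
}
```

With the mst output saved in a file, reg_value r28 < file prints the r28 value. Only the first match is returned, so for output where the same register name appears in several stack frames, pass in only the frame of interest.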
Use hcal to convert to decimal.
(0)> hcal 000000000003F0D2
Value hexa: 0003F0D2 Value decimal: 258258
(0)> q
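The same conversion can be done from the shell after leaving kdb. A minimal sketch follows. The function name hex_to_dec is made up for illustration, and it relies on printf treating an argument with a 0x prefix as a hexadecimal constant.

```shell
#!/bin/sh
# hex_to_dec is a hypothetical helper name.
# printf interprets an argument that starts with 0x as hexadecimal.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

hex_to_dec 000000000003F0D2   # r28 from the 64-bit example; prints 258258
```

Leading zeros in the register value do not need to be removed first.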
The process that hung init, and prevented getty from starting or respawning, has PID 258258. Now, we can look for this PID in the process table.
burritobso:/:# ps -ef |grep 258258
UID PID PPID C STIME TTY TIME CMD
root 258258 1 0 17:59:35 - 0:00 /bin/ksh /tmp/sleep.scr
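A plain grep for the number can also match the grep command itself or unrelated lines that happen to contain the same digits. As an alternative, the sketch below matches on the PID and PPID columns only, so it shows the process and its direct children. The function name show_proc_and_children is hypothetical. It uses only the standard ps options -e and -o.

```shell
#!/bin/sh
# show_proc_and_children is a hypothetical helper name.
# It prints PID, PPID, and command line for the given PID and for any
# process whose parent is that PID. Empty headers (pid= and so on)
# suppress the header line so awk sees only data rows.
show_proc_and_children() {
    ps -e -o pid= -o ppid= -o args= | awk -v pid="$1" '$1 == pid || $2 == pid'
}
```

For the example above, show_proc_and_children 258258 should list the ksh running /tmp/sleep.scr and the sleep command it started.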
Now, we need to comment out the line in /etc/inittab that is causing the hang. Use a colon ":" at the start of the line to comment out an inittab entry.
#vi /etc/inittab
:hang:23456789:wait:/tmp/sleep.scr >/dev/console 2>&1
Save /etc/inittab and reboot the system.
If getty still does not come up, there might be another line causing similar behavior.
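The edit can also be prepared without opening an editor. The following is a sketch under stated assumptions: comment_out_entry is a hypothetical helper, and it writes the modified copy to standard output rather than changing the file, so the result can be reviewed before it replaces /etc/inittab.

```shell
#!/bin/sh
# comment_out_entry is a hypothetical helper, not part of AIX.
# Usage: comment_out_entry IDENTIFIER FILE
# It prints FILE with a ":" placed in front of the entry whose
# identifier (first colon-separated field) matches IDENTIFIER.
comment_out_entry() {
    awk -F: -v id="$1" '$1 == id { print ":" $0; next } { print }' "$2"
}
```

For example, comment_out_entry hang /etc/inittab > /tmp/inittab.new writes the changed copy to a scratch file. Compare it against the original and keep a backup of /etc/inittab before copying it into place.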
B) On a 32-bit system advo5bso
===========================
#bootinfo -K
32
#ps -ef |grep getty
root 13162 19392 0 13:14:46 pts/1 0:00 grep getty
#kdb
(0)> f 1
pvthread+000080 STACK:
[000E4B4C]_kwaitpid+00033C (E2000334, 00000000, 024A8400, 00000103,
00000103, 510000F0, 00000001, FFFEFBFF [??])
[000E5A1C]kwaitpid+0000B4 (??, ??, ??, ??, ??)
[00003B14].sys_call+000000 ()
[D02842D0]waitpid+00016C (??, ??, ??)
[10002D4C]waitproc+0000D4 (??)
[1000234C]spawn+00017C ()
[10000ADC]main+00074C (??, ??)
[10000198]__start+000098 ()
(0)> dc kwaitpid 40|grep r4
.kwaitpid+000028 ori r30,r4,0
.kwaitpid+00008C ori r4,r31,0
The ori line at kwaitpid+000028 shows that the value in r4 is copied into r30. Turn on the display of stacked registers with "set 10", and then read r30 from the registers listed under the _kwaitpid frame.
(0)> set 10
10 display_stacked_regs true
(0)> f 1
pvthread+000080 STACK:
[000E4B4C]_kwaitpid+00033C (E2000334, 00000000, 024A8400, 00000103,
00000103, 510000F0, 00000001, FFFEFBFF [??])
r20 : 200296D8 r21 : 200296E0 r22 : 00000048 r23 : E2000200 r24 : 00000000
r25 : 2FF3B400 r26 : 00000000 r27 : 00000000 r28 : 00000000 r29 : 00000000
r30 : 0000428C r31 : 2FF219C0
[000E5A1C]kwaitpid+0000B4 (??, ??, ??, ??, ??)
r28 : 024A7800 r29 : 024A8400 r30 : 02622000 r31 : 00000000
[00003B14].sys_call+000000 ()
[D02842D0]waitpid+00016C (??, ??, ??)
r31 : 00000000
[10002D4C]waitproc+0000D4 (??)
r31 : 20000DA4
[1000234C]spawn+00017C ()
r25 : 000000C0 r26 : 20001164 r27 : 0000000A r28 : 200296D4 r29 : 00000004
r30 : 00000000 r31 : 200004B8
[10000ADC]main+00074C (??, ??)
r18 : DEADBEEF r19 : DEADBEEF r20 : DEADBEEF r21 : DEADBEEF r22 : DEADBEEF
r23 : DEADBEEF r24 : DEADBEEF r25 : DEADBEEF r26 : DEADBEEF r27 : 0000000A
r28 : 200003E0 r29 : 10000000 r30 : FFFFFFFB r31 : 200004A8
[10000198]__start+000098 ()
(0)> hcal 0000428C
Value hexa: 0000428C Value decimal: 17036
(0)> q
# ps -ef |grep 17036
root 13168 19392 0 13:16:51 pts/1 0:00 grep 17036
root 17036 1 0 13:12:09 - 0:00 /bin/ksh /tmp/sleep.scr
root 18320 17036 0 13:12:09 - 0:00 sleep 3000
# cat /etc/inittab |grep sleep
hang:23456789:wait:/tmp/sleep.scr >/dev/console 2>&1
Document Information
Modified date: 10 December 2019
UID: isg3T1011224