Register usage and conventions

The PowerPC® 32-bit architecture has 32 GPRs and 32 FPRs. Each GPR is 32 bits wide, and each FPR is 64 bits wide. There are also special registers for branching, exception handling, and other purposes. The General-Purpose Register Conventions table shows how GPRs are used.

Table 1. General-Purpose Register Conventions
Register	Status	Use
GPR0	volatile	In function prologs.
GPR1	dedicated	Stack pointer.
GPR2	dedicated	Table of Contents (TOC) pointer.
GPR3	volatile	First word of a function's argument list; first word of a scalar function return.
GPR4	volatile	Second word of a function's argument list; second word of a scalar function return.
GPR5	volatile	Third word of a function's argument list.
GPR6	volatile	Fourth word of a function's argument list.
GPR7	volatile	Fifth word of a function's argument list.
GPR8	volatile	Sixth word of a function's argument list.
GPR9	volatile	Seventh word of a function's argument list.
GPR10	volatile	Eighth word of a function's argument list.
GPR11	volatile	In calls by pointer and as an environment pointer for languages that require it (for example, PASCAL).
GPR12	volatile	For special exception handling required by certain languages and in glink code.
GPR13	reserved	Reserved in the 64-bit environment; not restored across system calls.
GPR14:GPR31	nonvolatile	These registers must be preserved across a function call.

The preferred method of using GPRs is to use the volatile registers first. Next, use the nonvolatile registers in descending order, starting with GPR31. GPR1 and GPR2 must be dedicated as stack and Table of Contents (TOC) area pointers, respectively. GPR1 and GPR2 must appear to be saved across a call, and must have the same values at return as when the call was made.

Volatile registers are scratch registers presumed to be destroyed across a call and are, therefore, not saved by the callee. Volatile registers are also used for specific purposes as shown in the previous table. Nonvolatile and dedicated registers are required to be saved and restored if altered and, thus, are guaranteed to retain their values across a function call.
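
The rows of Table 1 can be read as two small lookups: which GPR carries a given argument word, and what the save status of a given GPR is. The following C sketch encodes only what the table states. The function and type names are hypothetical and are not part of any AIX header, and the value -1 is an assumption that means "Table 1 lists no GPR for this word".

```c
#include <assert.h>

/* Hypothetical helpers that restate Table 1; not an AIX API. */
enum { FIRST_ARG_GPR = 3, LAST_ARG_GPR = 10 };

typedef enum {
    GPR_VOLATILE,
    GPR_DEDICATED,
    GPR_RESERVED,
    GPR_NONVOLATILE
} gpr_status;

/* Maps the zero-based word position in the argument list to the GPR
   that carries it (GPR3..GPR10). Returns -1 when Table 1 lists no GPR
   for that word, that is, for the ninth word onward. */
int gpr_for_arg_word(int word_index)
{
    assert(word_index >= 0);
    int gpr = FIRST_ARG_GPR + word_index;
    return (gpr <= LAST_ARG_GPR) ? gpr : -1;
}

/* Returns the Status column of Table 1 for GPR0..GPR31. */
gpr_status gpr_status_of(int gpr)
{
    assert(gpr >= 0 && gpr <= 31);
    if (gpr == 1 || gpr == 2) {
        return GPR_DEDICATED;      /* stack pointer and TOC pointer */
    }
    if (gpr == 13) {
        return GPR_RESERVED;
    }
    if (gpr >= 14) {
        return GPR_NONVOLATILE;    /* GPR14:GPR31 */
    }
    return GPR_VOLATILE;           /* GPR0 and GPR3..GPR12 */
}
```

For example, gpr_for_arg_word(0) yields 3 (GPR3) and gpr_status_of(31) yields GPR_NONVOLATILE, so a callee that alters GPR31 has to save and restore it.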

The Floating-Point Register Conventions table shows how the FPRs are used.

Table 2. Floating-Point Register Conventions
Register	Status	Use
FPR0	volatile	As a scratch register.
FPR1	volatile	First floating-point parameter; first 8 bytes of a floating-point scalar return.
FPR2	volatile	Second floating-point parameter; second 8 bytes of a floating-point scalar return.
FPR3	volatile	Third floating-point parameter; third 8 bytes of a floating-point scalar return.
FPR4	volatile	Fourth floating-point parameter; fourth 8 bytes of a floating-point scalar return.
FPR5	volatile	Fifth floating-point parameter.
FPR6	volatile	Sixth floating-point parameter.
FPR7	volatile	Seventh floating-point parameter.
FPR8	volatile	Eighth floating-point parameter.
FPR9	volatile	Ninth floating-point parameter.
FPR10	volatile	Tenth floating-point parameter.
FPR11	volatile	Eleventh floating-point parameter.
FPR12	volatile	Twelfth floating-point parameter.
FPR13	volatile	Thirteenth floating-point parameter.
FPR14:FPR31	nonvolatile	If modified, must be preserved across a call.

The preferred method of using FPRs is to use the volatile registers first. Next, the nonvolatile registers are used in descending order, starting with FPR31 and proceeding down to FPR14.

Only scalars are returned in multiple registers. The number of registers required depends on the size and type of the scalar. For floating-point values, the following results occur:

A 128-bit floating-point value returns the high-order 64 bits in FPR1 and the low-order 64 bits in FPR2.
An 8-byte or 16-byte complex value returns the real part in FPR1 and the imaginary part in FPR2.
A 32-byte complex value returns the real part as a 128-bit floating-point value in FPR1 and FPR2, with the high-order 64 bits in FPR1 and the low-order 64 bits in FPR2. The imaginary part of a 32-byte complex value returns the high-order 64 bits in FPR3 and the low-order 64 bits in FPR4.
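
The three cases above, together with the FPR1 row of Table 2, amount to a rule for how many FPRs (counted from FPR1) hold a floating-point scalar return. The C sketch below restates that rule. The names fp_kind and fpr_return_count are hypothetical, and returning 0 for a size the text does not mention is an assumption of this sketch rather than documented behavior.

```c
#include <assert.h>

/* Hypothetical classification used only by this sketch. */
typedef enum { FP_REAL, FP_COMPLEX } fp_kind;

/* Returns how many FPRs, starting at FPR1, hold a floating-point
   scalar return of the given kind and total size in bytes.
   Returns 0 for sizes the text above does not cover. */
int fpr_return_count(fp_kind kind, int total_bytes)
{
    assert(total_bytes > 0);

    if (kind == FP_REAL) {
        if (total_bytes == 4 || total_bytes == 8) {
            return 1;   /* FPR1 */
        }
        if (total_bytes == 16) {
            return 2;   /* high-order half in FPR1, low-order half in FPR2 */
        }
        return 0;
    }

    /* Complex values: real part first, then imaginary part. */
    if (total_bytes == 8 || total_bytes == 16) {
        return 2;       /* real in FPR1, imaginary in FPR2 */
    }
    if (total_bytes == 32) {
        return 4;       /* real in FPR1:FPR2, imaginary in FPR3:FPR4 */
    }
    return 0;
}
```

The result never exceeds 4, which matches Table 2: only FPR1 through FPR4 are listed as carrying part of a scalar return.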

Example of calling convention for complex types


complex double foo(complex double);

The argument is passed in fp1 and fp2, and the result is returned in fp1 and fp2. Subsequent complex double parameters are passed in the next two available registers, up to fp13, using either even-odd or odd-even pairs. After fp13, they are passed in the parameter area in memory, which is located at the beginning of the caller's stack frame.

Note: The skipped registers are not used for later parameters. In addition, these registers are not initialized by the caller and the called function must not depend on the value stored within the skipped registers.

A single precision complex (complex float) is passed the same way as double precision with the values widened to double precision.

Double and single precision complex (complex double and complex float) are returned in fp1 and fp2 with single precision values widened to double precision.
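
At the C level, the two halves that travel in a register pair are the real and imaginary parts of the value. The sketch below gives the prototype foo a hypothetical body (it swaps the two parts; the original shows only the prototype) and a hypothetical helper, complex_part, that reads one part. It relies on the C99 guarantee that a double complex has the same layout as an array of two doubles, real part first. Which registers the compiler actually uses is decided by the ABI described above, not by this code.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Hypothetical body for the prototype shown earlier: returns a value
   whose real and imaginary parts are exchanged. */
double complex foo(double complex z)
{
    double in[2];
    double out[2];
    double complex result;

    /* C99: a double complex is laid out as double[2], real part first. */
    memcpy(in, &z, sizeof in);
    out[0] = in[1];
    out[1] = in[0];
    memcpy(&result, out, sizeof result);
    return result;
}

/* Illustrative helper: part 0 is the real part, part 1 the imaginary
   part. Equivalent to creal() and cimag() from <complex.h>. */
double complex_part(double complex z, int part)
{
    double parts[2];

    assert(part == 0 || part == 1);
    memcpy(parts, &z, sizeof parts);
    return parts[part];
}
```

Under the convention described above, a call such as foo(1.0 + 2.0 * I) would carry the part read by complex_part(z, 0) in fp1 and the part read by complex_part(z, 1) in fp2.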

Quadruple precision complex (complex long double) parameters are passed in the next four available registers, from fp1 to fp13, and then in the parameter area. The registers fill in this order: upper half of the real part, lower half of the real part, upper half of the imaginary part, and lower half of the imaginary part.
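
The fill order for a quadruple precision complex value can be written down as an offset from the first register it occupies. The following C sketch does that. The names quad_part and quad_complex_part_fpr are hypothetical, and the sketch deliberately does not model a value that would run past fp13: it returns -1 for every part in that case, which is a simplifying assumption and not a statement about what the compiler does.

```c
#include <assert.h>

/* Hypothetical names; the enumerators follow the fill order in the text. */
typedef enum {
    REAL_UPPER = 0,
    REAL_LOWER = 1,
    IMAG_UPPER = 2,
    IMAG_LOWER = 3
} quad_part;

enum { FIRST_PARAM_FPR = 1, LAST_PARAM_FPR = 13 };

/* Returns the FPR that carries the given part of a complex long double
   whose first part is in first_fpr. Returns -1 when the four registers
   would extend past fp13 (not modeled by this sketch). */
int quad_complex_part_fpr(int first_fpr, quad_part part)
{
    assert(first_fpr >= FIRST_PARAM_FPR);
    assert(part >= REAL_UPPER && part <= IMAG_LOWER);

    if (first_fpr + IMAG_LOWER > LAST_PARAM_FPR) {
        return -1;
    }
    return first_fpr + (int)part;
}
```

With first_fpr set to 1, the four parts map to fp1 through fp4, which is the same layout the return convention uses for a 32-byte complex value.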

Note: In AIX, structs, classes, and unions are passed in gprs (or memory) and not in fprs. This is true even if they contain floating-point values. In Linux on PPC, the address of a copy in memory is passed in the next available gpr (or in memory). Varargs parameters are handled specially and are generally passed in both fprs and gprs.

Calling convention for decimal floating-point types (_Decimal32, _Decimal64, and _Decimal128)

_Decimal64 parameters are passed in the next available fpr and the results returned in fp1.

_Decimal32 parameters are passed in the lower half of the next available fpr and the results are returned in the lower half of fp1, without being converted to _Decimal64.

_Decimal128 parameters are passed in the next available even-odd fpr pair (or memory), even if that means skipping a register, and the results are returned in the even-odd pair fp2 and fp3. The reason is that the arithmetic instructions that operate on this type require even-odd register pairs.
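
The even-odd pairing rule for _Decimal128 can be sketched as a small allocation step: round the next free FPR up to an even number, note the register that was skipped, and fall back to memory when no pair is left. In the C sketch below, the function name decimal128_pair is hypothetical, the upper bound of fp13 is taken from Table 2 as an assumption, and the treatment of the "no pair left" case (return -1, report nothing skipped) is a simplification rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>

enum { LAST_PARAM_FPR = 13 };   /* assumption: last parameter FPR per Table 2 */

/* Given the next free FPR number, returns the even-numbered first
   register of the even-odd pair that would hold a _Decimal128
   parameter, or -1 when no such pair fits at or below fp13 (the value
   then goes to memory). *skipped receives the register passed over to
   reach an even number, or -1 if none was skipped. */
int decimal128_pair(int next_free_fpr, int *skipped)
{
    assert(next_free_fpr >= 1);
    assert(skipped != NULL);

    int first = next_free_fpr;
    *skipped = -1;

    if (first % 2 != 0) {
        first += 1;                 /* an odd register cannot start a pair */
    }
    if (first + 1 > LAST_PARAM_FPR) {
        return -1;                  /* no pair left: passed in memory */
    }
    if (first != next_free_fpr) {
        *skipped = next_free_fpr;   /* this register stays unused */
    }
    return first;
}
```

For the first parameter of a function, next_free_fpr is 1, so the sketch skips fp1 and selects the pair fp2 and fp3. This agrees with the PROC price,fp2,fp3 line in the _Decimal128 listing later in this section.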

Unlike float or double, DFP always requires a function prototype. Hence, _Decimal32 is not required to be widened to _Decimal64.

Example of calling convention for decimal floating-point type (_Decimal32)


#include <float.h>
#define DFP_ROUND_HALF_UP 4

_Decimal32 Add_GST_and_Ontario_PST_d32 (_Decimal32 price)
{
    _Decimal32 gst;
    _Decimal32 pst;
    _Decimal32 total;
    long original_rounding_mode = __dfp_get_rounding_mode ( );  /* save the current mode */
    __dfp_set_rounding_mode (DFP_ROUND_HALF_UP);
    gst = price * 0.06dd;   /* dd suffix: _Decimal64 constant */
    pst = price * 0.08dd;
    total = price + gst + pst;
    __dfp_set_rounding_mode (original_rounding_mode);           /* restore the saved mode */
    return (total);
}

| 000000 PDEF Add_GST_and_Ontario_PST_d32
>> 0| PROC price,fp1
0| 000000 stw 93E1FFFC 1 ST4A #stack(gr1,-4)=gr31
0| 000004 stw 93C1FFF8 1 ST4A #stack(gr1,-8)=gr30
0| 000008 stwu 9421FF80 1 ST4U gr1,#stack(gr1,-128)=gr1
0| 00000C lwz 83C20004 1 L4A gr30=.+CONSTANT_AREA(gr2,0)
0| 000010 addi 38A00050 1 LI gr5=80
0| 000014 ori 60A30000 1 LR gr3=gr5
>> 0| 000018 stfiwx 7C211FAE 1 STDFS price(gr1,gr3,0)=fp1
9| 00001C mffs FC00048E 1 LFFSCR fp0=fcr
9| 000020 stfd D8010058 1 STFL #MX_SET1(gr1,88)=fp0
9| 000024 lwz 80010058 1 L4A gr0=#MX_SET1(gr1,88)
9| 000028 rlwinm 5400077E 1 RN4 gr0=gr0,0,0x7
9| 00002C stw 9001004C 1 ST4A original_rounding_mode(gr1,76)=gr0
10| 000030 mtfsfi FF81410C 1 SETDRND fcr=4,fcr
11| 000034 ori 60A30000 1 LR gr3=gr5
11| 000038 lfiwax 7C011EAE 1 LDFS fp0=price(gr1,gr3,0)
11| 00003C dctdp EC000204 1 CVDSDL fp0=fp0,fcr
11| 000040 lfd C83E0000 1 LDFL fp1=+CONSTANT_AREA(gr30,0)
11| 000044 dmul EC000844 1 MDFL fp0=fp0,fp1,fcr
11| 000048 drsp EC000604 1 CVDLDS fp0=fp0,fcr
11| 00004C addi 38600040 1 LI gr3=64
11| 000050 ori 60640000 1 LR gr4=gr3
11| 000054 stfiwx 7C0127AE 1 STDFS gst(gr1,gr4,0)=fp0
12| 000058 ori 60A40000 1 LR gr4=gr5
12| 00005C lfiwax 7C0126AE 1 LDFS fp0=price(gr1,gr4,0)
12| 000060 dctdp EC000204 1 CVDSDL fp0=fp0,fcr
12| 000064 lfd C83E0008 1 LDFL fp1=+CONSTANT_AREA(gr30,8)
12| 000068 dmul EC000844 1 MDFL fp0=fp0,fp1,fcr
12| 00006C drsp EC000604 1 CVDLDS fp0=fp0,fcr
12| 000070 addi 38800044 1 LI gr4=68
12| 000074 ori 60860000 1 LR gr6=gr4
12| 000078 stfiwx 7C0137AE 1 STDFS pst(gr1,gr6,0)=fp0
13| 00007C lfiwax 7C012EAE 1 LDFS fp0=price(gr1,gr5,0)
13| 000080 lfiwax 7C211EAE 1 LDFS fp1=gst(gr1,gr3,0)
13| 000084 dctdp EC000204 1 CVDSDL fp0=fp0,fcr
13| 000088 dctdp EC200A04 1 CVDSDL fp1=fp1,fcr
13| 00008C mffs FC40048E 1 LFFSCR fp2=fcr
13| 000090 stfd D8410058 1 STFL #MX_SET1(gr1,88)=fp2
13| 000094 lwz 80010058 1 L4A gr0=#MX_SET1(gr1,88)
13| 000098 rlwinm 5400077E 1 RN4 gr0=gr0,0,0x7
13| 00009C mtfsfi FF81710C 1 SETDRND fcr=7,fcr
13| 0000A0 dadd EC000804 1 ADFL fp0=fp0,fp1,fcr
13| 0000A4 stw 90010058 1 ST4A #MX_SET1(gr1,88)=gr0
13| 0000A8 lfd C8210058 1 LFL fp1=#MX_SET1(gr1,88)
13| 0000AC mtfsf FC030D8E 1 LFSCR8 fsr,fcr=fp1,1,1
13| 0000B0 addi 38000007 1 LI gr0=7
13| 0000B4 addi 38600000 1 LI gr3=0
13| 0000B8 stw 90610068 1 ST4A #MX_CONVF1_0(gr1,104)=gr3
13| 0000BC stw 9001006C 1 ST4A #MX_CONVF1_0(gr1,108)=gr0
13| 0000C0 lfd C8210068 1 LDFL fp1=#MX_CONVF1_0(gr1,104)
13| 0000C4 drrnd EC010646 1 RRDFL fp0=fp0,fp1,3,fcr
13| 0000C8 drsp EC000604 1 CVDLDS fp0=fp0,fcr
13| 0000CC lfiwax 7C2126AE 1 LDFS fp1=pst(gr1,gr4,0)
13| 0000D0 dctdp EC000204 1 CVDSDL fp0=fp0,fcr
13| 0000D4 dctdp EC200A04 1 CVDSDL fp1=fp1,fcr
13| 0000D8 mffs FC40048E 1 LFFSCR fp2=fcr
13| 0000DC stfd D8410058 1 STFL #MX_SET1(gr1,88)=fp2
13| 0000E0 lwz 80810058 1 L4A gr4=#MX_SET1(gr1,88)
13| 0000E4 rlwinm 5484077E 1 RN4 gr4=gr4,0,0x7
13| 0000E8 mtfsfi FF81710C 1 SETDRND fcr=7,fcr
13| 0000EC dadd EC000804 1 ADFL fp0=fp0,fp1,fcr
13| 0000F0 stw 90810058 1 ST4A #MX_SET1(gr1,88)=gr4
13| 0000F4 lfd C8210058 1 LFL fp1=#MX_SET1(gr1,88)
13| 0000F8 mtfsf FC030D8E 1 LFSCR8 fsr,fcr=fp1,1,1
13| 0000FC stw 90610068 1 ST4A #MX_CONVF1_0(gr1,104)=gr3
13| 000100 stw 9001006C 1 ST4A #MX_CONVF1_0(gr1,108)=gr0
13| 000104 lfd C8210068 1 LDFL fp1=#MX_CONVF1_0(gr1,104)
13| 000108 drrnd EC010646 1 RRDFL fp0=fp0,fp1,3,fcr
13| 00010C drsp EC000604 1 CVDLDS fp0=fp0,fcr
13| 000110 addi 38600048 1 LI gr3=72
13| 000114 ori 60640000 1 LR gr4=gr3
13| 000118 stfiwx 7C0127AE 1 STDFS total(gr1,gr4,0)=fp0
14| 00011C lwz 8001004C 1 L4A gr0=original_rounding_mode(gr1,76)
14| 000120 stw 90010058 1 ST4A #MX_SET1(gr1,88)=gr0
14| 000124 lfd C8010058 1 LFL fp0=#MX_SET1(gr1,88)
14| 000128 mtfsf FC03058E 1 LFSCR8 fsr,fcr=fp0,1,1
>> 15| 00012C lfiwax 7C211EAE 1 LDFS fp1=total(gr1,gr3,0)
16| CL.1:
16| 000130 lwz 83C10078 1 L4A gr30=#stack(gr1,120)
16| 000134 addi 38210080 1 AI gr1=gr1,128
16| 000138 bclr 4E800020 1 BA lr

Example of calling convention for decimal floating-point type (_Decimal64)


#include <float.h>
#define DFP_ROUND_HALF_UP 4

_Decimal64 Add_GST_and_Ontario_PST_d64 (_Decimal64 price)
{
    _Decimal64 gst;
    _Decimal64 pst;
    _Decimal64 total;
    long original_rounding_mode = __dfp_get_rounding_mode ( );  /* save the current mode */
    __dfp_set_rounding_mode (DFP_ROUND_HALF_UP);
    gst = price * 0.06dd;   /* dd suffix: _Decimal64 constant */
    pst = price * 0.08dd;
    total = price + gst + pst;
    __dfp_set_rounding_mode (original_rounding_mode);           /* restore the saved mode */
    return (total);
}

| 000000 PDEF Add_GST_and_Ontario_PST_d64
>> 0| PROC price,fp1
0| 000000 stw 93E1FFFC 1 ST4A #stack(gr1,-4)=gr31
0| 000004 stw 93C1FFF8 1 ST4A #stack(gr1,-8)=gr30
0| 000008 stwu 9421FF80 1 ST4U gr1,#stack(gr1,-128)=gr1
0| 00000C lwz 83C20004 1 L4A gr30=.+CONSTANT_AREA(gr2,0)
>> 0| 000010 stfd D8210098 1 STDFL price(gr1,152)=fp1
9| 000014 mffs FC00048E 1 LFFSCR fp0=fcr
9| 000018 stfd D8010060 1 STFL #MX_SET1(gr1,96)=fp0
9| 00001C lwz 80010060 1 L4A gr0=#MX_SET1(gr1,96)
9| 000020 rlwinm 5400077E 1 RN4 gr0=gr0,0,0x7
9| 000024 stw 90010058 1 ST4A original_rounding_mode(gr1,88)=gr0
10| 000028 mtfsfi FF81410C 1 SETDRND fcr=4,fcr
11| 00002C lfd C8010098 1 LDFL fp0=price(gr1,152)
11| 000030 lfd C83E0000 1 LDFL fp1=+CONSTANT_AREA(gr30,0)
11| 000034 dmul EC000844 1 MDFL fp0=fp0,fp1,fcr
11| 000038 stfd D8010040 1 STDFL gst(gr1,64)=fp0
12| 00003C lfd C8010098 1 LDFL fp0=price(gr1,152)
12| 000040 lfd C83E0008 1 LDFL fp1=+CONSTANT_AREA(gr30,8)
12| 000044 dmul EC000844 1 MDFL fp0=fp0,fp1,fcr
12| 000048 stfd D8010048 1 STDFL pst(gr1,72)=fp0
13| 00004C lfd C8010098 1 LDFL fp0=price(gr1,152)
13| 000050 lfd C8210040 1 LDFL fp1=gst(gr1,64)
13| 000054 dadd EC000804 1 ADFL fp0=fp0,fp1,fcr
13| 000058 lfd C8210048 1 LDFL fp1=pst(gr1,72)
13| 00005C dadd EC000804 1 ADFL fp0=fp0,fp1,fcr
13| 000060 stfd D8010050 1 STDFL total(gr1,80)=fp0
14| 000064 lwz 80010058 1 L4A gr0=original_rounding_mode(gr1,88)
14| 000068 stw 90010060 1 ST4A #MX_SET1(gr1,96)=gr0
14| 00006C lfd C8010060 1 LFL fp0=#MX_SET1(gr1,96)
14| 000070 mtfsf FC03058E 1 LFSCR8 fsr,fcr=fp0,1,1
>> 15| 000074 lfd C8210050 1 LDFL fp1=total(gr1,80)
16| CL.1:
16| 000078 lwz 83C10078 1 L4A gr30=#stack(gr1,120)
16| 00007C addi 38210080 1 AI gr1=gr1,128
16| 000080 bclr 4E800020 1 BA lr

Example of calling convention for decimal floating-point type (_Decimal128)


#include <float.h>
#define DFP_ROUND_HALF_UP 4

_Decimal128 Add_GST_and_Ontario_PST_d128 (_Decimal128 price)
{
    _Decimal128 gst;
    _Decimal128 pst;
    _Decimal128 total;
    long original_rounding_mode = __dfp_get_rounding_mode ( );  /* save the current mode */
    __dfp_set_rounding_mode (DFP_ROUND_HALF_UP);
    gst = price * 0.06dd;   /* dd suffix: _Decimal64 constant */
    pst = price * 0.08dd;
    total = price + gst + pst;
    __dfp_set_rounding_mode (original_rounding_mode);           /* restore the saved mode */
    return (total);
}

| 000000 PDEF Add_GST_and_Ontario_PST_d128
>> 0| PROC price,fp2,fp3
0| 000000 stw 93E1FFFC 1 ST4A #stack(gr1,-4)=gr31
0| 000004 stw 93C1FFF8 1 ST4A #stack(gr1,-8)=gr30
0| 000008 stwu 9421FF70 1 ST4U gr1,#stack(gr1,-144)=gr1
0| 00000C lwz 83C20004 1 L4A gr30=.+CONSTANT_AREA(gr2,0)
>> 0| 000010 stfd D84100A8 1 STDFL price(gr1,168)=fp2
>> 0| 000014 stfd D86100B0 1 STDFL price(gr1,176)=fp3
9| 000018 mffs FC00048E 1 LFFSCR fp0=fcr
9| 00001C stfd D8010078 1 STFL #MX_SET1(gr1,120)=fp0
9| 000020 lwz 80010078 1 L4A gr0=#MX_SET1(gr1,120)
9| 000024 rlwinm 5400077E 1 RN4 gr0=gr0,0,0x7
9| 000028 stw 90010070 1 ST4A original_rounding_mode(gr1,112)=gr0
10| 00002C mtfsfi FF81410C 1 SETDRND fcr=4,fcr
11| 000030 lfd C80100A8 1 LDFL fp0=price(gr1,168)
11| 000034 lfd C82100B0 1 LDFL fp1=price(gr1,176)
11| 000038 lfd C85E0000 1 LDFL fp2=+CONSTANT_AREA(gr30,0)
11| 00003C lfd C87E0008 1 LDFL fp3=+CONSTANT_AREA(gr30,8)
11| 000040 dmulq FC001044 1 MDFE fp0,fp1=fp0-fp3,fcr
11| 000044 stfdp F4010040 1 STDFE gst(gr1,64)=fp0,fp1
12| 000048 lfd C80100A8 1 LDFL fp0=price(gr1,168)
12| 00004C lfd C82100B0 1 LDFL fp1=price(gr1,176)
12| 000050 lfd C85E0010 1 LDFL fp2=+CONSTANT_AREA(gr30,16)
12| 000054 lfd C87E0018 1 LDFL fp3=+CONSTANT_AREA(gr30,24)
12| 000058 dmulq FC001044 1 MDFE fp0,fp1=fp0-fp3,fcr
12| 00005C stfdp F4010050 1 STDFE pst(gr1,80)=fp0,fp1
13| 000060 lfd C80100A8 1 LDFL fp0=price(gr1,168)
13| 000064 lfd C82100B0 1 LDFL fp1=price(gr1,176)
13| 000068 lfd C8410040 1 LDFL fp2=gst(gr1,64)
13| 00006C lfd C8610048 1 LDFL fp3=gst(gr1,72)
13| 000070 daddq FC001004 1 ADFE fp0,fp1=fp0-fp3,fcr
13| 000074 lfd C8410050 1 LDFL fp2=pst(gr1,80)
13| 000078 lfd C8610058 1 LDFL fp3=pst(gr1,88)
13| 00007C daddq FC001004 1 ADFE fp0,fp1=fp0-fp3,fcr
13| 000080 stfdp F4010060 1 STDFE total(gr1,96)=fp0,fp1
14| 000084 lwz 80010070 1 L4A gr0=original_rounding_mode(gr1,112)
14| 000088 stw 90010078 1 ST4A #MX_SET1(gr1,120)=gr0
14| 00008C lfd C8010078 1 LFL fp0=#MX_SET1(gr1,120)
14| 000090 mtfsf FC03058E 1 LFSCR8 fsr,fcr=fp0,1,1
>> 15| 000094 lfd C8410060 1 LDFL fp2=total(gr1,96)
>> 15| 000098 lfd C8610068 1 LDFL fp3=total(gr1,104)
16| CL.1:
16| 00009C lwz 83C10088 1 L4A gr30=#stack(gr1,136)
16| 0000A0 addi 38210090 1 AI gr1=gr1,144
16| 0000A4 bclr 4E800020 1 BA lr