AIX transactional memory programming
Transactional memory (TM) is a shared-memory synchronization construct that allows process-threads to perform storage operations that appear to be atomic to other process-threads or applications.
Overview
TM is a construct that allows execution of lock-based critical sections of the code without acquiring a lock. The IBM® POWER8 processor is the first processor that implements TM programming. TM enables the following usage scenarios:
- Optimistic Execution of Lock-Based Applications – TM supports speculative execution of critical sections of code without acquiring a lock. This method provides the benefits of fine-grained locking to applications whose existing locks are not tuned for performance.
- Transactional Programming in High-Level Languages – The transactional programming model is a growing industry-wide standard that offers productivity gains relative to lock-based shared memory programs.
- Checkpoint/Rollback Usage – TM is used as a checkpoint to restore architectural state. This method enables speculative compiler optimizations during runtime code optimization or generation and simulation of checkpoints.
To use the TM facility, a process-thread marks the beginning and end of the sequence of storage accesses, or transaction, with the tbegin. and tend. instructions. The tbegin. instruction initiates the transactional execution, during which the loads and stores appear to occur atomically. The tend. instruction ends the transactional execution.
If a transaction is prematurely stopped, the storage updates that were made after executing the tbegin. instruction are rolled back. Correspondingly, the contents of a subset of the registers are also rolled back to the state before the tbegin. instruction was executed. When a transaction is prematurely stopped, a software failure handler is started. The failure can be of the transient type or the persistent type. Depending on the nature of the failure, the failure handler can retry the transaction or choose to employ a different locking construct or logic path.
The AIX® operating system supports the usage of TM including handling of TM state management across context switches and interrupts.
Checkpoint state
When a transaction is initiated, a set of registers is saved that represents the checkpoint state of the processor. If the transaction fails, these registers are restored to their values before the start of the transaction. The checkpoint state of the processor is also called the pre-transactional state. The checkpoint state includes the problem-state writable registers except for the CR0, FXCC, EBBHR, EBBRR, and BESCR registers, the performance monitor registers, and the TM SPRs.
The checkpoint state is copied into the respective user-accessible registers when the new treclaim. instruction is executed. This process allows privileged code to save or modify the values. The checkpoint state is copied back into the speculative registers from the respective user-accessible registers after the execution of the new trechkpt.
instruction.
Name | Title | Description | Privileged mtspr | Privileged mfspr | Size (bits) | SPR |
---|---|---|---|---|---|---|
FSCR | Facility Status and Control Register | Controls the available facilities in problem state and indicates the cause of a Facility unavailable interrupt. | yes | yes | 64 | 153 |
TEXASR | Transaction Exception And Summary Register | Contains the transaction level and summary information that is used by the transaction failure handlers. The 0:31 bits contain the cause of the failure. | no | no | 64 | 130 |
TFHAR | Transaction Failure Handler Address Register | Records the EA of the software failure handler. The TFHAR register is always set to the NIA for the tbegin. instruction that initiated the transaction. | no | no | 64 | 128 |
TFIAR | Transaction Failure Instruction Address Register | Set to the exact EA of the instruction that causes the failure, when possible. The accuracy of the TFIAR register is recorded in the Exact field (bit 37) of the TEXASR register. | no | no | 64 | 129 |
TEXASRU | Transaction Exception and Summary Register (Upper Half) | High-order half of TEXASR register. | no | no | 32 | 131 |
Name | Field | Value-Meaning | Bits |
---|---|---|---|
TEXASR | Failure Code (Note: bit 7 is referred to as the Failure Persistent field) | Transaction failure codes | 0:7 |
TEXASR | Disallowed | 0b1 - The instruction or access type is not allowed. | 8 |
TEXASR | Nesting Overflow | 0b1 - The maximum transaction level was exceeded. | 9 |
TEXASR | Footprint Overflow | 0b1 - The tracking limit for transactional storage accesses was exceeded. | 10 |
TEXASR | Self-Induced Conflict | 0b1 - A self-induced conflict occurred in suspended state. | 11 |
TEXASR | Non-Transactional Conflict | 0b1 - A conflict occurred with a non-transactional access by another processor. | 12 |
TEXASR | Transaction Conflict | 0b1 - A conflict occurred with another transaction. | 13 |
TEXASR | Translation Invalidation Conflict | 0b1 - A conflict occurred with a TLB invalidation. | 14 |
TEXASR | Implementation Specific | 0b1 - An implementation-specific condition caused the transaction to fail. | 15 |
TEXASR | Instruction Fetch Conflict | 0b1 - An instruction fetch was performed, by this thread or another thread, from a block that was previously written transactionally. | 16 |
TEXASR | Reserved for future failure cases | | 17:30 |
TEXASR | Abort | 0b1 - An abort was caused by the execution of a particular TM instruction. | 31 |
TEXASR | Suspended | 0b1 - The failure was recorded in suspended state. | 32 |
TEXASR | Reserved | | 33 |
TEXASR | Privilege | The privilege state of the thread ([MSR HV || PR]) at the time the failure was recorded. | 34:35 |
TEXASR | Failure Summary (FS) | 0b1 - A failure was detected and recorded. | 36 |
TEXASR | TFIAR Exact | 0b0 - The value in the TFIAR register is approximate. 0b1 - The value in the TFIAR register is exact. | 37 |
TEXASR | ROT | 0b0 - A non-ROT transaction was initiated. 0b1 - A ROT was initiated. | 38 |
TEXASR | Reserved | | 39:51 |
TEXASR | Transaction Level (TL) | Transaction level (nesting depth + 1) for the active transaction. Note: A value of 1 corresponds to an outer transaction. A value greater than 1 corresponds to a nested transaction. | 52:63 |
- Exactly one of bits 8-31 of the TEXASR register is set when the transaction failure is recorded. The single bit that is set indicates the particular instruction or event that caused the failure.
- A Rollback Only Transaction (ROT) is a sequence of instructions that is executed either as a unit or not at all. This construct allows for the speculative execution of a bulk of instructions with minimal cost. A ROT does not have the full atomicity of a normal transaction, nor its synchronization and serialization properties. Therefore, ROTs must not be used to manipulate shared data.
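The field layout in the table can be exercised with a small decoder. The following C code is a minimal sketch, not part of the AIX headers: the helper names (texasr_field, texasr_cause_bit, texasr_cause_name, texasr_describe) are hypothetical, and the code assumes that the 64-bit TEXASR value was already obtained by some other means. It relies only on the bit positions listed in the table, where bit 0 is the most significant bit.

```c
#include <stdint.h>
#include <stdio.h>

/* Extract bits first..last of a 64-bit register value, using the numbering
 * from the table, where bit 0 is the most significant bit. */
static uint64_t texasr_field(uint64_t texasr, unsigned first, unsigned last)
{
    unsigned width = last - first + 1;
    uint64_t mask = (width >= 64) ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
    return (texasr >> (63 - last)) & mask;
}

/* Return the number of the cause bit that is set (8..16 or 31), or -1 if
 * none of the defined cause bits is set. */
static int texasr_cause_bit(uint64_t texasr)
{
    for (unsigned bit = 8; bit <= 16; bit++)
        if (texasr_field(texasr, bit, bit))
            return (int)bit;
    return texasr_field(texasr, 31, 31) ? 31 : -1;
}

/* Map a cause bit number to the field name used in the table. */
static const char *texasr_cause_name(int bit)
{
    static const char *const names[] = {
        "Disallowed",                        /* bit 8  */
        "Nesting Overflow",                  /* bit 9  */
        "Footprint Overflow",                /* bit 10 */
        "Self-Induced Conflict",             /* bit 11 */
        "Non-Transactional Conflict",        /* bit 12 */
        "Transaction Conflict",              /* bit 13 */
        "Translation Invalidation Conflict", /* bit 14 */
        "Implementation Specific",           /* bit 15 */
        "Instruction Fetch Conflict"         /* bit 16 */
    };
    if (bit >= 8 && bit <= 16)
        return names[bit - 8];
    return bit == 31 ? "Abort" : "none recorded";
}

/* Print a one-line summary. The output format is illustrative only. */
static void texasr_describe(uint64_t texasr, FILE *out)
{
    fprintf(out, "failure code 0x%02x, persistent=%u, cause=%s, level=%u\n",
            (unsigned)texasr_field(texasr, 0, 7),
            (unsigned)texasr_field(texasr, 7, 7),
            texasr_cause_name(texasr_cause_bit(texasr)),
            (unsigned)texasr_field(texasr, 52, 63));
}
```

For a value with bit 7, bit 10, and bit 36 set and a transaction level of 1, texasr_cause_bit returns 10 (Footprint Overflow) and texasr_field(value, 7, 7) returns 1, which marks the failure as persistent.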
Software failure handler
When a transaction fails, control is redirected to the instruction that follows the tbegin. instruction, and CR0 is set to either 0b101 || 0 or 0b010 || 0. Therefore, the instruction that follows the tbegin. instruction must be a branch instruction predicated on bit 2 of CR0. For example, the tbegin. instruction can be followed by a beq branch instruction, which is predicated on bit 2 of CR0. The target of the branch must be a section of code that handles transaction failures. When the tbegin. instruction is successfully run at the start of the transaction, CR0 is set to either 0b000 || 0 or 0b010 || 0.
The TEXASR register reports the cause of the failure.
The failure code (FC) field, in bits 0-7, is used for the following scenarios:
- Privileged supervisor or hypervisor code causes the failure by using the treclaim. instruction.
- Problem-state code causes the failure by using a form of the tabort. instruction.
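When set, the Failure Persistent field (bit 7) of the TEXASR register indicates that the failure is persistent and that the transaction is bound to fail when it is attempted again. The failure codes that are reserved by the AIX operating system to indicate the cause of the failure are defined in the /usr/include/sys/machine.h file.
A sample transaction
The following assembly code shows a simple transaction that stores the value in r5 to the address in r4. If the transaction fails persistently, the code branches to an alternate path at the lock_based_update label. The code for the alternate path is not shown.
trans_entry: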
tbegin. # Start transaction
beq failure_hdlr # Handle transaction failure
stw r5, 0(r4) # Write to memory pointed to by r4.
tend. # End transaction
b trans_exit
failure_hdlr: # Handle transaction failures:
mfspr r4, TEXASRU # Read high-order half of TEXASR
andis. r5, r4, 0x0100 # Is the failure persistent?
bne lock_based_update # If persistent, acquire lock and
# then perform the write.
b trans_entry # If transient, try again.
lock_based_update:
trans_exit:
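The decision that the failure handler makes can also be expressed in C. The following code is a minimal sketch under stated assumptions: the names TEXASRU_PERSISTENT, ACTION_RETRY, ACTION_USE_LOCK, and failure_action are hypothetical, the TEXASRU value is assumed to have been read already, and the bounded retry count is an addition for illustration (the assembly sample retries transient failures without a limit).

```c
#include <stdint.h>

/* Same bit that "andis. r5, r4, 0x0100" tests: bit 7 of the 32-bit TEXASRU
 * value, counting from the most significant bit. */
#define TEXASRU_PERSISTENT UINT32_C(0x01000000)

/* Hypothetical result type for the handler's decision. */
enum action { ACTION_RETRY, ACTION_USE_LOCK };

/* Decide what to do after a transaction failure.
 * texasru      - high-order half of TEXASR, as read by the failure handler
 * attempts     - number of transactional attempts made so far
 * max_attempts - retry limit (an assumption, not part of the sample above) */
static enum action failure_action(uint32_t texasru, unsigned attempts,
                                  unsigned max_attempts)
{
    if (texasru & TEXASRU_PERSISTENT)
        return ACTION_USE_LOCK;   /* retrying cannot succeed */
    if (attempts >= max_attempts)
        return ACTION_USE_LOCK;   /* give up on a transient failure */
    return ACTION_RETRY;
}
```

Bounding the retries avoids spinning indefinitely when a transient failure keeps recurring, for example under heavy contention.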
Runtime determination of Transactional Memory capability
A program can determine whether a system supports the TM category of the POWER ISA by reading the SC_TM_VER system variable using the getsystemcfg subroutine.
A __power_tm() macro is provided in the /usr/include/sys/systemcfg.h file to determine the TM capability within a program. This macro is useful for software that conditionally uses the TM capability when it is present, or uses functionally equivalent lock-based code paths when the TM capability is not present.
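The conditional use that is described above might be structured as follows. This C code is a minimal sketch: TM_PRESENT, update_fn, update_with_tm, update_with_lock, and select_update_path are hypothetical names, both update routines are placeholders that only perform the store, and the non-AIX branch is an assumption that reports TM as absent so that the sketch compiles on other systems.

```c
#include <stdio.h>

#ifdef _AIX
#include <sys/systemcfg.h>            /* provides the __power_tm() macro */
#define TM_PRESENT() (__power_tm() ? 1 : 0)
#else
/* Assumption: outside AIX, treat TM as not present. */
#define TM_PRESENT() 0
#endif

typedef void (*update_fn)(int *target, int value);

/* Placeholder for a path that would wrap the store in a transaction. */
static void update_with_tm(int *target, int value)
{
    *target = value;
}

/* Placeholder for the functionally equivalent lock-based path. */
static void update_with_lock(int *target, int value)
{
    *target = value;
}

/* Choose the code path once, based on the runtime TM capability. */
static update_fn select_update_path(void)
{
    return TM_PRESENT() ? update_with_tm : update_with_lock;
}
```

Selecting the path once at startup keeps the capability check out of the critical section itself.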
Extended context structure
The earlier versions of the AIX operating system introduced support for extended context structures to support the vector state and user keys. The existing extended context structure support is further extended to support machine state that is required by TM.
An extended context is allocated and pinned for each transactional process-thread when it first uses TM. If the extended context area cannot be allocated and pinned, then the process receives a SIGSEGV signal that results in termination of the process.
The machine-context information is included in the sigcontext structure that is provided to signal handlers. When a signal handler returns, the machine context present in the sigcontext structure is activated. The sigcontext structure is actually a subset of the larger ucontext structure. The two structures are identical up to sizeof(struct sigcontext).
When the AIX operating system builds a signal context to be passed to a signal handler, a ucontext structure is built on the stack of the signal handler. The machine-context portion of a signal context must contain all of the active machine state, including the volatile and nonvolatile state for the involuntarily interrupted context. The ucontext structure contains an indicator to determine whether extended context information is available.
The __extctx field in the ucontext structure is the address of an extended context structure, which is defined in the /usr/include/sys/context.h file. The extended context information is valid when the value of the __extctx_magic field in the ucontext structure is equal to __EXTCTX_MAGIC. The additional machine state for a thread that uses the TM capability is saved and restored as a member of the context extension to the ucontext structure as part of signal delivery and return.
If an application chooses to explicitly enable the use of Transactional Memory, it takes an extended size ucontext structure that already has space for the __extctx field that is included by the implicit definition of __EXTABI__ by the compiler. The extended ucontext structure can also be picked up by an explicit definition of __AIXEXTABI.
The getcontext(), setcontext(), makecontext(), and swapcontext() subroutines of libc are not supported while in transactional or suspended state. When called within a transaction, the getcontext(), setcontext(), and makecontext() subroutines result in a persistent transaction failure of TM_LIBC type, which is defined in the /usr/include/sys/machine.h file.
When the swapcontext() subroutine is called within a transaction, it results in the following behavior:
- When the swapcontext() subroutine is called in transactional state, it results in a persistent transaction failure of TM_LIBC type.
- When the swapcontext() subroutine is called in suspended state, the transaction is doomed, the specified ucontext structure is swapped in, and execution of the program is resumed from the specified ucontext structure. The resulting state and subsequent behavior after the swapcontext() subroutine returns are undefined.
If the getcontext(), setcontext(), and swapcontext() subroutines are called in a non-transactional state, the subroutines do not retrieve or restore any extended TM context into or from the ucontext structure pointed to by the ucp or oucp parameters. No error is indicated when the setcontext() or swapcontext() subroutines are called with the extended TM context present.
See the /usr/include/sys/context.h header file for detailed information of the extended context.
Signal delivery
Asynchronous signals that are received by an application while in a transaction are delivered non-transactionally. When in transactional state, the delivery of synchronous signals is not allowed and instead results in a persistent transaction failure of TM_SYNC_SIGNAL type, as defined in the /usr/include/sys/machine.h file.
Alignment interrupts and program interrupts
In transactional state, alignment interrupts and program interrupts that are caused by an illegal operation or an operation that requires emulation result in a persistent transaction failure of TM_ALIGN_INT type or TM_INV_OP type, as defined in the /usr/include/sys/machine.h file. When in suspended state, alignment and program interrupts are processed normally by using the non-speculative semantics.
System calls
It is recommended that system calls not be invoked within a transaction. System calls are supported within a transaction only when the transaction is suspended through the tsuspend. instruction.
When a system call is invoked while a processor or thread is transactional and the transaction is not suspended, the system call is not invoked by the AIX kernel and the associated transaction fails persistently. When this error occurs, the FC field of the TEXASR register contains the TM_ILL_SC failure code, which is defined in the /usr/include/sys/machine.h file.
It is assumed that any operations performed under a suspended transaction, when the application programmer has explicitly suspended the transaction, are intended to be persistent. Any operations that are performed by a system call that is invoked while in suspended state are not rolled back even if the transaction fails.
The AIX operating system does not support system calls to be made while in transactional state because there is no way to roll back any operations, including I/O, performed by AIX underneath a system call.
setjmp() and longjmp() subroutines
The setjmp() and longjmp() subroutines of libc are not supported in transactional or suspended state because of the effects of setting a jump buffer and jumping back to the buffer. Consider the following scenarios:
- If the setjmp() subroutine is called inside of a transaction and the corresponding longjmp() subroutine is called after the transaction ends, the jump is to a speculative state that is now invalid.
- If the setjmp() subroutine is called before the transaction, a corresponding longjmp() subroutine goes to the state before the transaction started regardless of whether the transaction has ended, failed, or aborted.
- If the setjmp() subroutine is called within a transaction and then the transaction is aborted, the updates that are made to the jump buffer by the setjmp() subroutine will not appear to have occurred.
When the setjmp() subroutine is called within a transaction, it results in a persistent transaction failure of either TM_LIBC type or TM_ILL_SC type, both of which are defined in the /usr/include/sys/machine.h file.
When the longjmp() subroutine is called within a transaction, it results in the following behavior:
- When the longjmp() subroutine is called in transactional state, it results in a persistent transaction failure of either TM_LIBC type or TM_ILL_SC type, both of which are defined in the /usr/include/sys/machine.h file.
- When the longjmp() subroutine is called in suspended state, the transaction is doomed, the specified jump buffer is restored, and execution of the program returns to the corresponding setjmp() subroutine. The resulting state and subsequent behavior are undefined after the setjmp() subroutine returns from the longjmp() subroutine.
Compilers
An AIX operating system compiler that supports Transactional Memory must conform to the AIX ABI. When TM is enabled, a C or C++ compiler must predefine __EXTABI__. Refer to the compiler documentation for detailed information.
Assembler
The AIX operating system assembler, /usr/ccs/bin/as, supports the additional instruction set defined for TM in the POWER ISA, as implemented by the POWER8 processor. You can use the -m pwr8 assembly mode or the .machine pwr8 pseudo-op within the source file to enable the assembly of TM instructions. For more information, refer to the assembly language reference.
Debugger
The /usr/ccs/bin/dbx debugger supports machine-level debugging of TM programs. This support includes the ability to disassemble the new Transactional Memory instructions and to view the TM SPRs: the TEXASR, TEXASRU, TFIAR, and TFHAR registers.
Setting a breakpoint inside of a transaction causes the transaction to unconditionally fail whenever the breakpoint is encountered. Therefore, the suggested approach to debug a failing transaction is to set a breakpoint on the transaction’s failure handler and then view the TEXASR and TFIAR registers when the breakpoint is encountered to determine the cause and location of the failure.
The TEXASR, TFIAR, and TFHAR registers can be viewed by using the print subcommand with the $texasr, $tfiar, or $tfhar parameter. The line of code that is associated with the address found in the TFIAR and TFHAR registers can be viewed through the list subcommand, for example:
(dbx) list at $tfiar
The dbx tm_status subcommand is used to view and interpret the contents of the TEXASR register. This subcommand is used to determine the nature of a transaction failure.
Enablement for third-party debuggers is provided in the form of a new PTT_READ_TM ptrace operation for reading the TM state of a thread. Refer to the ptrace documentation for details.
Tracing support
The AIX trace facility is expanded to include a set of trace events for TM operations that are performed by the AIX operating system including pre-emption causing transaction failure and other various operations that can cause transaction failure. The trace event identifier 675 can be used as input to the trace and trcrpt commands to view TM-related trace events.
Core files
AIX threads library
The use of Transactional Memory is not supported for applications that use M:N threads. Undefined behavior may occur in transactional threads in an environment where more than one thread shares a single kernel thread. Usage of Transactional Memory by an application that uses M:N threads may lead to a persistent transaction failure with the Failure Code of TM_PTH_PREEMPTED being set in the TEXASR register.