AIX transactional memory programming

Transactional memory (TM) is a shared-memory synchronization construct that allows process-threads to perform storage operations that appear atomic to other process-threads or applications.

Overview

TM is a construct that allows execution of lock-based critical sections of code without acquiring a lock. The IBM® POWER8 processor is the first POWER processor to implement TM.

The TM facility is useful in scenarios such as the following:
  • Optimistic Execution of Lock-Based Applications – TM supports speculative execution of critical sections of code without acquiring a lock. This method gives applications whose existing locks are not tuned for performance the benefits of fine-grained locking.
  • Transactional Programming in High-Level Languages – The transactional programming model is a growing industry-wide standard that offers productivity gains relative to lock-based shared memory programs.
  • Checkpoint/Rollback Usage – TM can be used as a checkpoint from which architectural state is restored. This method enables speculative compiler optimizations during runtime code optimization or generation, and the simulation of checkpoints.

To use the TM facility, a process-thread marks the beginning and end of a sequence of storage accesses, called a transaction, with the tbegin. and tend. instructions. The tbegin. instruction initiates transactional execution, during which the loads and stores appear to occur atomically. The tend. instruction ends transactional execution.

If a transaction is prematurely stopped, the storage updates that were made after the tbegin. instruction was executed are rolled back. Correspondingly, the contents of a subset of the registers are rolled back to their state before the tbegin. instruction was executed. When a transaction is prematurely stopped, a software failure handler is started. The failure can be transient or persistent. Depending on the nature of the failure, the failure handler can retry the transaction or switch to a different locking construct or logic path.

The AIX® operating system supports the usage of TM including handling of TM state management across context switches and interrupts.

Checkpoint state

When a transaction is initiated, a set of registers that represents the checkpoint state of the processor is saved. If the transaction fails, these registers are restored to their values at the point before the start of the transaction. The checkpoint state of the processor is also called the pre-transactional state. The checkpoint state includes the problem-state writable registers except for the CR0, FXCC, EBBHR, EBBRR, and BESCR registers, the performance monitor registers, and the TM SPRs.

Note: The checkpoint state cannot be directly accessed through the supervisor state or the problem state.
The checkpoint state is copied into the respective registers after the new treclaim. instruction is executed. This process allows privileged code to save or modify the values. The checkpoint state is copied back into the speculative registers from the respective user-accessible registers after the execution of the new trechkpt. instruction.
The following TM SPRs are added to the machine state for the processor:
| Name | Title | Description | Privileged mtspr | Privileged mfspr | Size (bits) | SPR |
| --- | --- | --- | --- | --- | --- | --- |
| FSCR | Facility Status and Control Register | Controls the available facilities in problem state and indicates the cause of a Facility Unavailable interrupt. | yes | yes | 64 | 153 |
| TEXASR | Transaction Exception And Summary Register | Contains the transaction level and summary information that is used by the transaction failure handlers. Bits 0:31 contain the cause of the failure. | no | no | 64 | 130 |
| TFHAR | Transaction Failure Handler Address Register | Records the EA of the software failure handler. The TFHAR register is always set to the NIA for the tbegin. instruction that initiated the transaction. | no | no | 64 | 128 |
| TFIAR | Transaction Failure Instruction Address Register | Set to the exact EA of the instruction that causes the failure, when possible. The accuracy of the TFIAR register is recorded in the Exact field (bit 37) of the TEXASR register. | no | no | 64 | 129 |
| TEXASRU | Transaction Exception and Summary Register (Upper Half) | High-order half of the TEXASR register. | no | no | 32 | 131 |

The new TEXASR register contains information related to the state of a transaction and the cause of a transaction failure. The following table describes the fields included in the TEXASR register:
| TEXASR field | Value and meaning | Bits |
| --- | --- | --- |
| Failure Code (bit 7 is referred to as the Failure Persistent field) | Transaction failure codes | 0:7 |
| Disallowed | 0b1 - The instruction or access type is not allowed. | 8 |
| Nesting Overflow | 0b1 - The maximum transaction level was exceeded. | 9 |
| Footprint Overflow | 0b1 - The tracking limit for transactional storage accesses was exceeded. | 10 |
| Self-Induced Conflict | 0b1 - A self-induced conflict occurred in suspended state. | 11 |
| Non-Transactional Conflict | 0b1 - A conflict occurred with a non-transactional access by another processor. | 12 |
| Transaction Conflict | 0b1 - A conflict occurred with another transaction. | 13 |
| Translation Invalidation Conflict | 0b1 - A conflict occurred with a TLB invalidation. | 14 |
| Implementation Specific | 0b1 - An implementation-specific condition caused the transaction to fail. | 15 |
| Instruction Fetch Conflict | 0b1 - An instruction fetch, by this thread or another thread, was performed from a block that was previously written transactionally. | 16 |
| Reserved for future failure cases | | 17:30 |
| Abort | 0b1 - An abort was caused by the execution of a particular TM instruction. | 31 |
| Suspended | 0b1 - The failure was recorded in suspended state. | 32 |
| Reserved | | 33 |
| Privilege | The privilege state of the thread (MSR[HV] \|\| MSR[PR]) at the time the failure was recorded. | 34:35 |
| Failure Summary (FS) | 0b1 - A failure was detected and recorded. | 36 |
| TFIAR Exact | 0b0 - The value in the TFIAR register is approximate. 0b1 - The value in the TFIAR register is exact. | 37 |
| ROT | Set to 0b0 when a non-ROT tbegin. instruction is executed. Set to 0b1 when a ROT is initiated. | 38 |
| Reserved | | 39:51 |
| Transaction Level (TL) | Transaction level (nesting depth + 1) for the active transaction. Otherwise, 0 if the most recent transaction completed successfully, or the transaction level at which the most recent transaction failed if it did not complete successfully. A value of 1 corresponds to an outer transaction. A value greater than 1 corresponds to a nested transaction. | 52:63 |

Notes:
  • Exactly one of bits 8:31 of the TEXASR register is set when a transaction failure is recorded. The bit that is set identifies the instruction or event that caused the failure.
  • A Rollback Only Transaction (ROT) is a sequence of instructions that is either executed as a unit or not executed at all. This construct allows a block of instructions to be executed speculatively at minimal cost. A ROT does not have the full atomicity of a normal transaction, nor its synchronization and serialization properties. Therefore, ROTs must not be used to manipulate shared data.
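
The bit positions in the TEXASR table follow POWER ISA numbering, in which bit 0 is the most significant bit of the 64-bit register. The following C sketch shows one way a failure handler might decode a TEXASR value that it has already read. The helper names (ppc_bits, texasr_failure_code, and so on) are hypothetical and are not part of any AIX header; only the bit ranges come from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits first..last (inclusive) from a 64-bit register image,
 * using POWER ISA numbering: bit 0 is the most significant bit. */
static inline uint64_t ppc_bits(uint64_t reg, unsigned first, unsigned last)
{
    assert(first <= last && last <= 63);
    unsigned width = last - first + 1;
    uint64_t mask = (width == 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return (reg >> (63 - last)) & mask;
}

/* Hypothetical accessors; bit ranges are taken from the TEXASR table. */
static inline unsigned texasr_failure_code(uint64_t t)      { return (unsigned)ppc_bits(t, 0, 7); }
static inline int      texasr_is_persistent(uint64_t t)     { return (int)ppc_bits(t, 7, 7); }
static inline int      texasr_footprint_overflow(uint64_t t){ return (int)ppc_bits(t, 10, 10); }
static inline int      texasr_failure_summary(uint64_t t)   { return (int)ppc_bits(t, 36, 36); }
static inline int      texasr_tfiar_exact(uint64_t t)       { return (int)ppc_bits(t, 37, 37); }
static inline unsigned texasr_transaction_level(uint64_t t) { return (unsigned)ppc_bits(t, 52, 63); }
```

A handler would typically check texasr_failure_summary() first, and consult texasr_tfiar_exact() before trusting the address in the TFIAR register.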

Software failure handler

When a transaction fails, the hardware redirects control to the failure handler that is associated with the outermost transaction. Control is redirected to the instruction that follows the tbegin. instruction, and CR0 is set to either 0b101 || 0 or 0b010 || 0. Therefore, the instruction after the tbegin. instruction must be a branch instruction predicated on bit 2 of CR0. For example, a beq instruction placed immediately after the tbegin. instruction branches when bit 2 of CR0 is set. The target of the branch must be a section of code that handles transaction failures. When the tbegin. instruction is successfully run at the start of the transaction, CR0 is set to either 0b000 || 0 or 0b010 || 0.
Note: Bits 0:31 of TEXASR report the cause of the failure. The failure code (FC) field, in bits 0:7, is used in the following scenarios:
  • Privileged supervisor or hypervisor code causes the failure by using the treclaim. instruction.
  • Problem-state code causes the failure by using a form of the tabort. instruction.
A value of 1 in bit 7 of TEXASR indicates that the failure is persistent and that the transaction is bound to fail if it is attempted again. The failure codes that the AIX operating system reserves to indicate the cause of a failure are defined in the /usr/include/sys/machine.h file.
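
The choice that a failure handler makes between retrying and falling back can be sketched in C as follows. This is only an illustration under stated assumptions: the enum, the function name, and the retry budget are hypothetical, and the only fact taken from the text is that bit 7 of TEXASR (bit 0 being the most significant bit) marks a persistent failure.

```c
#include <stdint.h>

/* Hypothetical names; not defined by any AIX header. */
enum txn_action { TXN_RETRY, TXN_FALLBACK_TO_LOCK };

/* Bit 7 of TEXASR in POWER ISA numbering (bit 0 = most significant). */
#define TEXASR_PERSISTENT_BIT (UINT64_C(1) << (63 - 7))

/* Decide what to do after a failed transaction.
 *   texasr       - TEXASR value read in the failure handler
 *   attempts     - number of times the transaction has already failed
 *   max_attempts - assumed retry budget for transient failures */
enum txn_action txn_failure_policy(uint64_t texasr,
                                   unsigned attempts,
                                   unsigned max_attempts)
{
    if (texasr & TEXASR_PERSISTENT_BIT)
        return TXN_FALLBACK_TO_LOCK;   /* retrying is bound to fail */
    if (attempts >= max_attempts)
        return TXN_FALLBACK_TO_LOCK;   /* bound the cost of transient failures */
    return TXN_RETRY;
}
```

Bounding the number of retries is a design choice rather than a requirement: a transient conflict can recur indefinitely under contention, so a budget guarantees forward progress through the lock-based path.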

A sample transaction

The following assembler code example shows a simple transaction that writes the value in GPR 5 into the address in GPR 4, which is assumed to be shared among multiple threads of execution. If the transaction fails due to a persistent cause, the code falls back to another code path at the lock_based_update label. The code for the alternate path is not shown.
trans_entry:
  tbegin.                            # Start transaction
  beq          failure_hdlr          # Handle transaction failure
  stw          r5, 0(r4)             # Write to memory pointed to by r4.
  tend.                              # End transaction
  b            trans_exit
failure_hdlr:                        # Handle transaction failures:
  mfspr        r6, TEXASRU           # Read high-order half of TEXASR into a
                                     # scratch register (r6 and r7 are assumed
                                     # free) so that r4 and r5 survive a retry.
  andis.       r7, r6, 0x0100        # Is the failure persistent (bit 7)?
  bne          lock_based_update     # If persistent, acquire lock and
                                     # then perform the write.
  b            trans_entry           # If transient, try again.

lock_based_update:                   # Lock-based alternate path (not shown)

trans_exit:
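
For comparison, the same pattern can be sketched in C. This sketch assumes a compiler that provides the GCC-style PowerPC HTM built-in functions (__builtin_tbegin, __builtin_tend, __builtin_tabort, __builtin_get_texasru) and the htmintrin.h header, enabled with an option such as -mhtm; other compilers expose TM differently, so check the compiler documentation. The names shared_store, fallback_lock, and MAX_TM_RETRIES are hypothetical. When the __HTM__ macro is not defined, only the lock-based path is compiled.

```c
#include <stdatomic.h>
#include <stdint.h>
#if defined(__HTM__)
#include <htmintrin.h>      /* GCC: _TEXASRU_FAILURE_PERSISTENT() */
#endif

#define MAX_TM_RETRIES 8    /* hypothetical budget for transient failures */

static atomic_int fallback_lock;    /* 0 = free, 1 = held */

static void lock_acquire(void)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak(&fallback_lock, &expected, 1))
        expected = 0;               /* CAS overwrote it; reset and spin */
}

static void lock_release(void)
{
    atomic_store(&fallback_lock, 0);
}

/* Store value to *addr, transactionally when the hardware allows it. */
void shared_store(volatile uint32_t *addr, uint32_t value)
{
#if defined(__HTM__)
    for (int tries = 0; tries < MAX_TM_RETRIES; tries++) {
        if (__builtin_tbegin(0)) {
            /* Reading the lock inside the transaction means that a thread
             * taking the lock later conflicts with this transaction. */
            if (atomic_load_explicit(&fallback_lock,
                                     memory_order_relaxed) != 0)
                __builtin_tabort(0);
            *addr = value;
            __builtin_tend(0);
            return;
        }
        /* Failure path: stop retrying if the failure is persistent. */
        if (_TEXASRU_FAILURE_PERSISTENT(__builtin_get_texasru()))
            break;
    }
#endif
    /* Lock-based alternate path. */
    lock_acquire();
    *addr = value;
    lock_release();
}
```

The lock check inside the transaction keeps the transactional and lock-based paths mutually exclusive, which the single-store assembler example does not need to show.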

Runtime determination of Transactional Memory capability

A program can determine whether a system supports the TM category of the POWER ISA by reading the SC_TM_VER system variable with the getsystemcfg subroutine. The __power_tm() macro, which is provided in the /usr/include/sys/systemcfg.h file, determines the TM capability from within a program. This macro is useful for software that uses the TM capability when it is present and falls back to functionally equivalent lock-based code paths when it is not.
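
A minimal sketch of such conditional use follows. It assumes only what is stated above, namely that sys/systemcfg.h provides a __power_tm() macro on AIX; the function names and the two placeholder update paths are hypothetical. On any other platform, or with a header that lacks the macro, the sketch reports TM as unavailable.

```c
#if defined(_AIX)
#include <sys/systemcfg.h>   /* __power_tm(), per the text above */
#endif

/* Returns 1 when the system reports TM support, 0 otherwise. */
int tm_available(void)
{
#if defined(_AIX) && defined(__power_tm)
    return __power_tm() ? 1 : 0;
#else
    return 0;   /* not AIX, or the header does not define the macro */
#endif
}

typedef void (*update_fn)(long *counter);

/* Placeholder bodies: a real program would put its transactional and
 * lock-based implementations of the same update here. */
void update_with_tm(long *counter)   { ++*counter; }
void update_with_lock(long *counter) { ++*counter; }

/* Choose the code path once, for example at program start. */
update_fn select_update_path(void)
{
    return tm_available() ? update_with_tm : update_with_lock;
}
```

Selecting the path once and calling through the returned pointer avoids repeating the capability check on every update.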

Extended context structure

Earlier versions of the AIX operating system introduced extended context structures to support the vector state and user keys. This support is further extended to hold the machine state that is required by TM.

An extended context is allocated and pinned for each transactional process-thread when it first uses TM. If the extended context area cannot be allocated and pinned, then the process receives a SIGSEGV signal that results in termination of the process.

The machine-context information is included in the sigcontext structure that is provided to signal handlers. When a signal handler returns, the machine context present in the sigcontext structure is activated. The sigcontext structure is actually a subset of the larger ucontext structure. The two structures are identical up to sizeof(struct sigcontext). When the AIX operating system builds a signal context to be passed to a signal handler, a ucontext structure is built on the stack of the signal handler. The machine-context portion of a signal context must contain all of the active machine state, including the volatile and nonvolatile state for the involuntarily interrupted context. The ucontext structure contains an indicator to determine whether extended context information is available.

The __extctx field in the ucontext structure is the address of an extended context structure, which is defined in the /usr/include/sys/context.h file. The extended context information is valid when the value of the __extctx_magic field in the ucontext structure is equal to __EXTCTX_MAGIC. The additional machine state for a thread that uses the TM capability is saved and restored as a member of the context extension to the ucontext structure, as a part of signal delivery and return.

If an application explicitly enables the use of Transactional Memory, it picks up an extended-size ucontext structure that already has space for the __extctx field, because the compiler implicitly defines __EXTABI__. The extended ucontext structure can also be picked up by explicitly defining __AIXEXTABI.

The getcontext(), setcontext(), makecontext(), and swapcontext() subroutines of libc are not supported in transactional or suspended state. When called within a transaction, the getcontext(), setcontext(), and makecontext() subroutines result in a persistent transaction failure of the TM_LIBC type, which is defined in the /usr/include/sys/machine.h file.

When the swapcontext() subroutine is called within a transaction, the behavior depends on the transaction state:
  • In transactional state, the call results in a persistent transaction failure of the TM_LIBC type.
  • In suspended state, the transaction is doomed, the specified ucontext structure is swapped in, and execution of the program resumes in the context that the structure describes. The resulting state and subsequent behavior after the swapcontext() subroutine returns are undefined.

If the getcontext(), setcontext(), and swapcontext() subroutines are called in a non-transactional state, the subroutines do not retrieve or restore any extended TM context into or from the ucontext structure pointed to by the ucp or oucp parameters. No error is indicated when the setcontext() or swapcontext() subroutines are called with the extended TM context present.

See the /usr/include/sys/context.h header file for detailed information of the extended context.

Signal delivery

Asynchronous signals that an application receives while in a transaction are delivered non-transactionally. In transactional state, the delivery of synchronous signals is not allowed and instead results in a persistent transaction failure of the TM_SYNC_SIGNAL type, as defined in the /usr/include/sys/machine.h file.

Alignment interrupts and program interrupts

In transactional state, alignment interrupts and program interrupts that are caused by an illegal operation or by an operation that requires emulation result in a persistent transaction failure of the TM_ALIGN_INT type or the TM_INV_OP type, as defined in the /usr/include/sys/machine.h file. In suspended state, alignment and program interrupts are processed normally by using non-speculative semantics.

System calls

Do not invoke system calls within a transaction. System calls are supported within a transaction only when the transaction is suspended through the tsuspend. instruction.

When a system call is invoked while a processor or thread is transactional and the transaction is not suspended, the AIX kernel does not run the system call and the associated transaction fails persistently. When this error occurs, the FC field of the TEXASR register contains the TM_ILL_SC failure code, which is defined in the /usr/include/sys/machine.h file.

Any operations that are performed while the application programmer has explicitly suspended the transaction are assumed to be intended as persistent. Operations that are performed by a system call that is invoked in suspended state are not rolled back, even if the transaction fails.

The AIX operating system does not support making system calls in transactional state because there is no way to roll back the operations, including I/O, that AIX performs underneath a system call.

setjmp() and longjmp() subroutines

The setjmp() and longjmp() subroutines of libc are not supported in transactional or suspended state because of the effects of setting a jump buffer and jumping back to the buffer. Consider the following scenarios:
  1. If the setjmp() subroutine is called inside of a transaction and the corresponding longjmp() subroutine is called after the transaction ends, the jump is to a speculative state that is now invalid.
  2. If the setjmp() subroutine is called before the transaction, a corresponding longjmp() subroutine goes to the state before the transaction started regardless of whether the transaction has ended, failed, or aborted.
  3. If the setjmp() subroutine is called within a transaction and then the transaction is aborted, the updates that are made to the jump buffer by the setjmp() subroutine will not appear to have occurred.

When the setjmp() subroutine is called within a transaction, it results in a persistent transaction failure of either the TM_LIBC type or the TM_ILL_SC type, as defined in the /usr/include/sys/machine.h file.

When the longjmp() subroutine is called within a transaction, the behavior depends on the transaction state:
  • In transactional state, the call results in a persistent transaction failure of either the TM_LIBC type or the TM_ILL_SC type, as defined in the /usr/include/sys/machine.h file.
  • In suspended state, the transaction is doomed, the specified jump buffer is restored, and execution of the program returns to the corresponding setjmp() subroutine. The resulting state and subsequent behavior after the setjmp() subroutine returns from the longjmp() subroutine are undefined.

Compilers

An AIX compiler that supports Transactional Memory must conform to the AIX ABI. When TM is enabled, a C or C++ compiler must predefine __EXTABI__. Refer to the compiler documentation for detailed information.

Assembler

The AIX operating system assembler, /usr/ccs/bin/as, supports the additional instruction set that is defined for TM in the POWER ISA and implemented by the POWER8 processor. You can use the -m pwr8 assembly mode or the .machine pwr8 pseudo-op within the source file to enable the assembly of TM instructions. For more information, refer to the assembly language reference.

Debugger

The /usr/ccs/bin/dbx debugger supports machine-level debugging of TM programs. This support includes the ability to disassemble the new Transactional Memory instructions and to view the TM SPRs (TEXASR, TEXASRU, TFIAR, and TFHAR).

Setting a breakpoint inside a transaction causes the transaction to fail unconditionally whenever the breakpoint is encountered. Therefore, the suggested approach to debugging a failing transaction is to set a breakpoint on the failure handler of the transaction, and then view the TEXASR and TFIAR registers when the breakpoint is encountered to determine the cause and location of the failure.

In dbx, the TEXASR, TFIAR, and TFHAR registers can be viewed by using the print subcommand with the $texasr, $tfiar, or $tfhar parameter. The line of code that is associated with the address in the TFIAR or TFHAR register can be viewed through the list subcommand, for example:
(dbx) list at $tfiar
The dbx tm_status subcommand is used to view and interpret the contents of the TEXASR register. This subcommand is used to determine the nature of a transaction failure.

Enablement for third-party debuggers is provided in the form of a new PTT_READ_TM ptrace operation for reading the TM state of a thread. Refer to the ptrace documentation for details.

Tracing support

The AIX trace facility is expanded to include a set of trace events for TM operations that the AIX operating system performs, including preemption and other operations that can cause transaction failure. The trace event identifier 675 can be passed to the trace and trcrpt commands to view TM-related trace events.

Core files

The AIX operating system also supports the inclusion of TM machine state in the core file for processes or threads that use TM. If a process or thread is using or has used TM, the TM machine state is included in the core image for that thread.
Note: The TM state is only supported in the current core file formats for the AIX operating system. You can use the dbx command to read and view the TM machine state of a TM-enabled core file.

AIX threads library

The use of Transactional Memory is not supported for applications that use M:N threads. Undefined behavior may occur in transactional threads in an environment where more than one thread shares a single kernel thread. Use of Transactional Memory by an application that uses M:N threads may lead to a persistent transaction failure with the TM_PTH_PREEMPTED failure code set in the TEXASR register.