Notification frameworks

Most applications do not need to be aware of the AIX® Live Update operation. During the Live Update operation, an application is checkpointed after the application receives a checkpoint signal. During the checkpointing process, the mobility mechanism takes over the application and saves the application-specific resources, then re-creates the application on the surrogate partition. When the resources are restored, the application resumes its operations. All applications are checkpointed at the same time and restarted at the same time.

Some applications need to interact with the Live Update operation. Such applications can use the Dynamic Logical Partitioning (DLPAR) framework. When the Live Update operation starts on the original partition, applications are notified during the check phase. The applications can use the dr_reconfig() system call to acknowledge the Live Update operation before the Live Update timeout (60 seconds) expires. This timeout gives applications time to prepare themselves for the DLPAR event.

During the check phase, an application can query the dr_info structure for details about the DLPAR event, such as the type of event and the current phase. For the Live Update event, the origin of the notification (the original partition or the surrogate partition) can also be queried. An application can use a DR_EVENT_FAIL event to stop the Live Update operation during the check phase if the application cannot survive a checkpoint or restart at that time. Due to the timing of the check notification on the surrogate partition, the DR_EVENT_FAIL event applies only to applications that are started from the inittab process on the surr-boot-rootvg volume group.

Before the applications are checkpointed on the original partition, a DLPAR notification is sent to applications during the pre phase. When the mobility operation is done and applications are restarted on the surrogate partition, a DLPAR notification is sent to applications during the post phase on both the original and surrogate partitions. Only base processes can see the post event on the original partition. Applications that are moved to the surrogate partition receive the post notification on the surrogate partition. If an error occurs, a DLPAR notification is sent to the applications during the post-error phase.
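The reply rule described above can be sketched in C as follows. This is a minimal illustration and not AIX code: the lu_phase and lu_reply enumerations and the lu_reply_for() function are hypothetical stand-ins, and the sketch assumes, following the text above, that an application asks for the operation to stop only during the check phase.

```c
/* Hypothetical stand-ins for the four DLPAR phases. On AIX the current
 * phase is reported through the dr_info structure instead. */
enum lu_phase {
    LU_PHASE_CHECK,
    LU_PHASE_PRE,
    LU_PHASE_POST,
    LU_PHASE_POST_ERROR
};

/* Hypothetical replies: acknowledge the event, or ask for the
 * Live Update operation to stop (the role of DR_EVENT_FAIL). */
enum lu_reply {
    LU_REPLY_ACK,
    LU_REPLY_FAIL
};

/* Decide how to answer a notification. can_checkpoint is nonzero when
 * the application can survive a checkpoint and restart right now. */
enum lu_reply lu_reply_for(enum lu_phase phase, int can_checkpoint)
{
    /* Assumption: only the check phase is used to stop the operation. */
    if (phase == LU_PHASE_CHECK && !can_checkpoint)
        return LU_REPLY_FAIL;
    return LU_REPLY_ACK;
}
```

In a real application, the phase would come from querying the dr_info structure, and the reply would be delivered through dr_reconfig() before the timeout expires.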

Dynamic reconfiguration or DLPAR framework

The Live Update operation is registered as a Dynamic Reconfiguration (DR) or Dynamic Logical Partitioning (DLPAR) operation. This means that while the Live Update operation is running, no other DLPAR operation can be performed, and while any DLPAR operation is in progress, the Live Update operation cannot be started. Therefore, the configuration of the original LPAR is preserved during the Live Update operation. The DLPAR operations resume after the Live Update operation completes.

The DLPAR framework is also used to inform applications, kernel, and kernel extensions of the Live Update operation. The DLPAR framework supports the following phases:

  • check
  • pre
  • post
  • post-error

A notification is sent to applications, kernel, or kernel extensions at each of these four phases. If applications and kernel extensions are integrated into the DLPAR framework, the applications and kernel extensions can interact with the Live Update operation.

Integration with DLPAR

Applications can integrate with the DLPAR framework in either of the following ways:

  • By handling the SIGRECONFIG signal. Within the signal handler, the dr_reconfig() subroutine can be used to query and acknowledge the DLPAR event. The handler must reconfigure the application.
  • By installing a set of DLPAR scripts. These scripts are started when a DLPAR event occurs and must be designed to respond appropriately to the Live Update operation. Applications must reconfigure themselves when they receive a DLPAR notification.
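The signal-based method can be sketched as follows. This is a simplified illustration under stated assumptions: the names DLPAR_SIGNAL, dlpar_event_pending, on_dlpar_signal(), and install_dlpar_handler() are hypothetical, SIGUSR1 is substituted on systems that do not define SIGRECONFIG so that the sketch can be compiled elsewhere, and the handler only records that a notification arrived instead of calling dr_reconfig().

```c
#include <signal.h>

/* SIGRECONFIG is the signal named in the text. Where it is not defined,
 * SIGUSR1 is substituted purely so that the sketch can be built and
 * exercised (assumption). */
#ifdef SIGRECONFIG
#define DLPAR_SIGNAL SIGRECONFIG
#else
#define DLPAR_SIGNAL SIGUSR1
#endif

/* Set by the handler; cleared by the code that services the event. */
volatile sig_atomic_t dlpar_event_pending = 0;

/* Signal handler. On AIX, dr_reconfig() would be used here to query
 * and acknowledge the event; this stand-in only records the arrival. */
void on_dlpar_signal(int signo)
{
    (void)signo;
    dlpar_event_pending = 1;
}

/* Hypothetical helper: install the handler. Returns 0 on success and
 * -1 on failure. */
int install_dlpar_handler(void)
{
    return signal(DLPAR_SIGNAL, on_dlpar_signal) == SIG_ERR ? -1 : 0;
}
```

An application would call install_dlpar_handler() once at startup. Because the notification must be acknowledged within the Live Update timeout, the event should be serviced promptly after the flag is set.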

Kernel extensions use the reconfig_register_list() kernel service to register reconfiguration handlers for DLPAR events. These handlers are called when DLPAR events occur.

Live Update support in DLPAR

The Live Update operation introduces a new DLPAR event.

The dr_op field of the dr_info structure is set to DR_OP_LVUPD for a Live Update event. The field in the dr_info structure that indicates the origin of the DLPAR notification is defined in the sys/dr.h file as follows:
ushort lvup
When the dr_reconfig() subroutine is called for the Live Update event, the lvup field is set to LIVEUPDTORIG (the original partition is the origin of the DLPAR notification) or LIVEUPDTSURR (the surrogate partition is the origin of the DLPAR notification). These values are defined in the sys/dr.h file as follows:
#define LIVEUPDTORIG			0x1
#define LIVEUPDTSURR			0x2
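
As an illustration of how these values might be interpreted, the following C sketch maps an lvup value to the name of the originating partition. The lvup_origin_name() function is hypothetical. The two constants repeat the values quoted above and are guarded so that the definitions from the sys/dr.h file take precedence where that header is included. The sketch assumes that the field holds exactly one of the two values.

```c
/* Values quoted in the text above; guarded so that the definitions
 * from sys/dr.h are used when that header has been included. */
#ifndef LIVEUPDTORIG
#define LIVEUPDTORIG 0x1
#endif
#ifndef LIVEUPDTSURR
#define LIVEUPDTSURR 0x2
#endif

/* Hypothetical helper: translate the lvup value into a readable
 * origin name. Any other value is reported as "unknown". */
const char *lvup_origin_name(unsigned short lvup)
{
    if (lvup == LIVEUPDTORIG)
        return "original";
    if (lvup == LIVEUPDTSURR)
        return "surrogate";
    return "unknown";
}
```

On AIX, the argument would be the lvup field of the dr_info structure after a dr_reconfig() query for an event whose dr_op field is DR_OP_LVUPD.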

Alternative to DLPAR

The DLPAR or DR framework does not enforce an order of execution of scripts within the same phase. If subsystems rely on synchronization of their operations during a specific phase, these subsystems must implement the synchronization among themselves.

To save these subsystems from having to implement a synchronization mechanism, the Live Update framework provides an alternative notification system. The lvupdateRegScript command can be used to register a specific script with a priority.

The priority can be an integer value in the range 1 to 10. For more information about priorities, see the timeline table in the Timeline to run the DLPAR scripts topic. During the Live Update operation, before the check event is issued, the scripts that are registered with the LVUP_CHECK event are executed in order from the highest priority to the lowest priority. The same methodology is applied to the rest of the phases. The script must be registered only once, during the installation of the application.
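
The ordering rule can be sketched in C as follows. This is an illustration only and does not describe how the Live Update framework is implemented: the reg_script structure and both functions are hypothetical. The text above does not state whether 1 or 10 is the highest priority, so the sketch assumes that a larger number means a higher priority; reverse the comparator if the timeline table says otherwise.

```c
#include <stdlib.h>

/* Hypothetical record for one registered script; not an AIX structure. */
struct reg_script {
    const char *path;   /* script location (illustrative) */
    int priority;       /* integer in the range 1 to 10 */
};

/* qsort comparator that places larger priority values first.
 * Assumption: a larger number means a higher priority. */
int by_priority_desc(const void *a, const void *b)
{
    const struct reg_script *x = a;
    const struct reg_script *y = b;
    return (y->priority > x->priority) - (y->priority < x->priority);
}

/* Arrange the scripts registered for one event in execution order. */
void order_scripts(struct reg_script *scripts, size_t count)
{
    qsort(scripts, count, sizeof *scripts, by_priority_desc);
}
```

Note that qsort() does not guarantee a stable order, so scripts that share a priority can end up in any relative order in this sketch.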

The script owner must specify whether the script must be registered and run on the original partition or the surrogate partition. The Live Update operation fails if a script fails during the LVUP_CHECK or LVUP_PRE events.