Kernel extension customization

An AIX Live Update operation can affect kernel extensions. The Dynamic Logical Partitioning (DLPAR) framework is used to communicate the progress of the operation between the Live Update operation and kernel extensions.

The following table describes the kernel extension states in the original partition and the surrogate partition during each phase:

check phase
  Original partition: Kernel extensions are notified at the same time as applications. Any data on the orig-rootvg environment is copied to the surr-boot-rootvg environment when the data is created.
  Surrogate partition: Kernel extensions are notified at the same time as applications. Checkpointed data is available on both the surr-boot-rootvg and surr-mir-rootvg volume groups because of mirroring. The surr-mir-rootvg device is available only after the pre phase.

pre phase
  Original partition: Kernel extensions are notified after applications are checkpointed. The checkpointed data must be saved to the orig-rootvg volume group. Because of mirroring, the data is also available on the surr-mir-rootvg volume group. The data becomes available in the chrooted environment for the surrogate partition after the splitvg operation, which occurs only after the DLPAR notification. After a restart of the surrogate partition, kernel extensions need to account for the changed location of the file. If the old path is x, the new path is /old/x.
  Surrogate partition: Kernel extensions are notified when the file systems of the surr-mir-rootvg volume group are mounted. The data that is collected during the pre phase on the original partition is available only in the chrooted environment (after the root directory is changed). Applications that are on the surrogate partition must be aware of the availability of the chrooted environment.

post phase
  Original partition: This notification is sent to applications when applications are started on the surrogate partition.
  Surrogate partition: This notification is sent to applications when applications are started on the surrogate partition.

post-error phase
  Original partition: Kernel extensions can take appropriate action.
  Surrogate partition: Gives kernel extensions the opportunity to respond to the Live Update failure, depending on the phase in which the post-error occurs.
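
The path change that is described for the pre phase (a file saved at x is found at /old/x after the surrogate partition restarts) can be sketched as a small helper. This is a portable C illustration only, not AIX kernel code; the function name old_root_path and its return convention are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an AIX API): given the absolute path at which
 * data was saved on the original partition, build the path at which the
 * same data is visible on the surrogate partition, following the rule
 * "if the old path is x, the new path is /old/x".
 * Returns 0 on success, -1 if path is not absolute or out is too small.
 */
int old_root_path(const char *path, char *out, size_t out_len)
{
    int n;

    if (path == NULL || out == NULL || path[0] != '/')
        return -1;

    n = snprintf(out, out_len, "/old%s", path);
    if (n < 0 || (size_t)n >= out_len)
        return -1;   /* formatting error or truncated result */

    return 0;
}
```

A handler that runs on the surrogate partition could call such a helper on the path that it used during the pre phase on the original partition before it opens the saved file.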

If a kernel extension expects that handling a DLPAR request takes a long time, its handler must return DR_WAIT to the caller and proceed with the request asynchronously. When the request is completed, the handler must call the reconfig_complete() kernel service.
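
The deferred-completion pattern can be modeled in portable C as follows. This is a sketch under stated assumptions, not AIX code: the SKETCH_ constants, the sketch_request structure, and sketch_reconfig_complete() are hypothetical stand-ins so that the example compiles anywhere. On AIX, DR_WAIT and the other return codes come from the system headers, and completion is reported through the reconfig_complete() kernel service.

```c
#include <stddef.h>

/* Stand-in return codes for this example only; the values are arbitrary. */
enum sketch_rc { SKETCH_DR_SUCCESS = 0, SKETCH_DR_FAIL = 1, SKETCH_DR_WAIT = 2 };

/* Hypothetical bookkeeping for one DLPAR request. */
struct sketch_request {
    void *event;        /* opaque token that identifies the request */
    int   pending;      /* 1 while asynchronous work is outstanding */
    int   completed_rc; /* result reported at completion, -1 if none yet */
};

/* Stand-in for reconfig_complete(): records the final result. */
static void sketch_reconfig_complete(struct sketch_request *req, int rc)
{
    req->completed_rc = rc;
    req->pending = 0;
}

/* Handler: short work finishes inline, long work is deferred. */
int sketch_handler(struct sketch_request *req, void *event, int is_long_running)
{
    req->event = event;
    req->completed_rc = -1;

    if (!is_long_running) {
        req->pending = 0;
        return SKETCH_DR_SUCCESS;   /* handled synchronously */
    }

    req->pending = 1;               /* a worker finishes the request later */
    return SKETCH_DR_WAIT;          /* tell the caller not to block on us */
}

/* Worker side: called when the deferred work is done. */
void sketch_worker_done(struct sketch_request *req, int ok)
{
    if (req->pending)
        sketch_reconfig_complete(req, ok ? SKETCH_DR_SUCCESS : SKETCH_DR_FAIL);
}
```

The point of the split is that the handler returns quickly in both cases; only the worker path reports the final result.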

Application state that is held in a kernel extension must be handled by that kernel extension. The kernel extension needs to checkpoint this state when the application is checkpointed and reload the correct state when the application is restarted.

Device considerations

When the surrogate partition is started, the devices must be configured as they are on the original partition. A device must have the same name, the same device number (devno, that is, the major and minor numbers), and the same device configuration on both the original partition and the surrogate partition.
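
As an illustration of the identity requirement, the following portable C sketch compares the name and device number of two device records. The dev_identity structure, its field names, and the same_device_identity() function are hypothetical and are not AIX data structures; the comparison of the device configuration itself is left out.

```c
#include <string.h>

/* Hypothetical record for this example; not an AIX structure. */
struct dev_identity {
    const char *name;       /* logical device name */
    unsigned    dev_major;  /* major part of devno */
    unsigned    dev_minor;  /* minor part of devno */
};

/* Returns 1 when the name and devno (major, minor) match, 0 otherwise. */
int same_device_identity(const struct dev_identity *a,
                         const struct dev_identity *b)
{
    return strcmp(a->name, b->name) == 0 &&
           a->dev_major == b->dev_major &&
           a->dev_minor == b->dev_minor;
}
```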

Some devices might have customized attributes that are modified in the Object Data Manager (ODM) but have not yet taken effect (such changes take effect when the LPAR is rebooted). When the surrogate partition is booted, the customized attributes take effect. The storage devices might not have the same multipathing topology on the surrogate partition as on the original partition.

Kernel extensions in mobility

Kernel extensions need special consideration for mobility so that the workload is not interrupted. For most kernel extensions, unloading them on the original partition and reloading them on the surrogate partition suffices.

Safe kernel extensions

By default, all kernel extensions that are loaded on the original partition must be identified as safe for the Live Update operation, unless you override this requirement with the kext_check setting in the /var/adm/ras/liveupdate/lvupdate.data file.

Generally, a kernel extension is safe for the Live Update operation if the kernel extension is aware of the Live Update operation or does not need to be aware of the Live Update operation. A kernel extension is deemed to be Live Update safe if it meets one of the following requirements:

  • The kernel extension is loaded with the SYS_LUSAFE flag.
  • The kernel extension name is in the /etc/liveupdate/lvup_SafeKE file.

To mark a kernel extension as Live Update safe, load it by using the sysconfig() call with the SYS_LUSAFE flag, which is defined in the sys/sysconfig.h file.

In some safe kernel extensions, the SYS_LUSAFE flag might not be set. You can mark them as safe for a Live Update operation by using the lvupdateSafeKE command.

Safe kernel extensions are listed in the /etc/liveupdate/lvup_SafeKE file. Duplicate entries are not allowed in this list. Each kernel extension must be listed with its full path.
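
The two rules for the list (full paths only, no duplicates) can be expressed as a short check. The following C sketch assumes that the entries have already been read into an array of strings; it does not parse the file, whose exact format is not described here. The function name and the return codes are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Validates a list of safe kernel extension entries.
 * Returns 0 if the list is valid, 1 if an entry is not a full path,
 * and 2 if an entry appears more than once.
 */
int check_safe_ke_list(const char *const entries[], size_t count)
{
    size_t i, j;

    for (i = 0; i < count; i++) {
        /* A full path is taken here to mean a path that starts with '/'. */
        if (entries[i] == NULL || entries[i][0] != '/')
            return 1;

        /* Compare against every earlier, already validated entry. */
        for (j = 0; j < i; j++) {
            if (strcmp(entries[i], entries[j]) == 0)
                return 2;
        }
    }
    return 0;
}
```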

In all modes, the Live Update operation validates that the loaded kernel extensions are safe, even when you choose not to enforce the requirement. When the requirement is not enforced, the Live Update operation logs the noncompliant kernel extensions but continues to operate.

Loading kernel extensions

When the surrogate partition is started, it loads only those kernel extensions that are related to the devices that are configured. Commands that usually run during the regular initialization of an LPAR might not run. As a result, some kernel extensions that are needed by checkpointed applications might not be loaded when the applications are restarted. The Live Update framework offers more than one mechanism to handle this situation:

  • Applications with kernel extensions can be enabled for checkpoint if they manage the loading and unloading of the kernel extensions. The kernel extensions must be unloaded before the applications are frozen, and they can be loaded again when the applications are restarted.
  • Kernel extensions can be preloaded on the surrogate partition before the applications are restarted. The Live Update framework offers a registration mechanism. All loading methods that are registered for the Live Update operation are executed before the applications are restarted. The lvupdateRegKE command can be used to add or remove kernel extensions to be preloaded. The full path of the kernel extension is needed. If a loading error occurs, the Live Update operation is stopped.
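
The preload behavior (run every registered loading method before the applications restart, and stop on the first error) can be modeled as a simple loop. This is an illustrative sketch, not the Live Update implementation: the ke_registration structure, the preload_all() function, and the two stub loaders are hypothetical.

```c
#include <stddef.h>

/* Hypothetical loading method: returns 0 on success, nonzero on failure. */
typedef int (*ke_load_fn)(const char *full_path);

/* Hypothetical registration record for one kernel extension. */
struct ke_registration {
    const char *full_path;  /* full path of the kernel extension */
    ke_load_fn  load;       /* loading method registered for it */
};

/* Stub loaders used only to exercise the sketch. */
int stub_load_ok(const char *full_path)   { (void)full_path; return 0; }
int stub_load_fail(const char *full_path) { (void)full_path; return 1; }

/*
 * Runs the registered loading methods in order. Returns the index of the
 * first entry that fails (missing full path or load error), or count if
 * every entry loads. A caller would stop the operation when the return
 * value is less than count.
 */
size_t preload_all(const struct ke_registration *regs, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (regs[i].full_path == NULL || regs[i].full_path[0] != '/')
            break;                               /* full path is required */
        if (regs[i].load(regs[i].full_path) != 0)
            break;                               /* loading error */
    }
    return i;
}
```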

Example for interaction between a process and a kernel extension

This example shows how the interaction between a process and a kernel extension must be handled. The goal of the Live Update operation is to preserve the behavior of workloads in the update process.

Suppose that an application comprises a test_process process and a test_ke kernel extension. The test_ke kernel extension has a counter variable that is used to count some events. The test_process process reads counter from test_ke and consumes it during its execution. When test_ke is loaded, counter is initialized to 0, and its value increases with time. In the Live Update operation, when test_process is checkpointed, its process state is saved, but the counter value is not saved. Because kernel extensions are not checkpointed, you must ensure that the counter value is preserved when test_ke is loaded on the surrogate partition. The DLPAR framework supports this function in the Live Update operation.

  1. Applications are checkpointed on the original partition.
  2. A notification is sent to the kernel extensions at the pre phase.
  3. The test_ke kernel extension receives the notification through the reconfiguration handlers that it registered for DLPAR events by using the reconfig_register_list() kernel service.
  4. In the handler for the pre phase, the counter is saved in the /var/adm/ras/liveupdate/kext/test_ke file. This file is located on rootvg, which is mirrored, so that it can be transferred to the surrogate partition.
  5. On the surrogate partition, the pre phase notification is sent to kernel extensions after the surr-mir-rootvg environment is mounted. The saved data for the test_ke kernel extension, including the counter variable, is then available. The state of the test_ke kernel extension can be reconfigured to match the state when it was saved.
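
The save and restore steps for the counter can be sketched in portable C. This is a user-space illustration that uses standard I/O so that it runs anywhere; a real kernel extension would use kernel file services instead, and the function names, the text file format, and the return convention are assumptions made for this example. The path is passed in as a parameter so that the caller decides where the state file is located.

```c
#include <stdio.h>

/*
 * Pre phase on the original partition: write the counter to the state file.
 * Returns 0 on success, -1 on error.
 */
int save_counter(const char *path, unsigned long counter)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;

    ok = fprintf(f, "%lu\n", counter) > 0;
    if (fclose(f) != 0)
        ok = 0;              /* a failed close can mean lost data */

    return ok ? 0 : -1;
}

/*
 * Pre phase on the surrogate partition: read the counter back.
 * Returns 0 on success, -1 if the file is missing or malformed.
 */
int restore_counter(const char *path, unsigned long *counter)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;

    ok = fscanf(f, "%lu", counter) == 1;
    fclose(f);

    return ok ? 0 : -1;
}
```

On the surrogate partition, the path that is passed to the restore step must take into account the changed file location in the chrooted environment.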