Kernel extension customization
During the AIX Live Update operation, kernel extensions can be impacted. The Dynamic Logical Partitioning (DLPAR) platform is used to communicate the operation progress between the Live Update operation and kernel extensions.
The following table describes the kernel extension states in the original partition and the surrogate partition during each phase:
Phase | Original partition | Surrogate partition |
---|---|---|
check | Kernel extensions are notified at the same time as applications. Any data on the orig-rootvg environment is copied to the surr-boot-rootvg environment when the data is created. | Kernel extensions are notified at the same time as applications. Checkpointed data is available on both the surr-boot-rootvg and surr-mir-rootvg volume groups because of mirroring. The surr-mir-rootvg device is available only after the pre phase. |
pre | Kernel extensions are notified after applications are checkpointed. The checkpointed data must be saved to the orig-rootvg volume group. Because of mirroring, the data is also available on the surr-mir-rootvg volume group. The data becomes available in the chrooted environment for the surrogate partition after the splitvg operation, which occurs only after the DLPAR notification. After a restart of the surrogate partition, kernel extensions need to account for the changed location of the file. If the old path is x, the new path is /old/x. | Kernel extensions are notified when the file systems of the surr-mir-rootvg volume group are mounted. The data that is collected during the pre phase on the original partition is available only in the chrooted environment (after the root directory is changed). Applications that are on the surrogate partition must be aware of the availability of the chrooted environment. |
post | This notification is sent to applications when applications are started on the surrogate partition. | This notification is sent to applications when applications are started on the surrogate partition. |
post-error | Kernel extensions can take appropriate action. | Gives kernel extensions the opportunity to respond to the Live Update failure, depending on the phase in which the error occurs. |
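The path change described in the pre row (old path x becomes /old/x) can be sketched as a small helper. This is a minimal illustration, not part of any AIX interface: the function name old_root_path() and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: maps a path that was saved on the original partition
 * to the location described in the table (old path x becomes /old/x).
 * Returns 0 on success, or -1 if the result does not fit in the buffer. */
int old_root_path(const char *path, char *out, size_t outlen)
{
    assert(path != NULL && out != NULL);

    /* Avoid a doubled slash when the saved path is already absolute. */
    const char *sep = (path[0] == '/') ? "" : "/";
    size_t need = strlen("/old") + strlen(sep) + strlen(path) + 1;

    if (need > outlen)
        return -1;
    snprintf(out, outlen, "/old%s%s", sep, path);
    return 0;
}
```

For example, passing /var/adm/ras/liveupdate/kext/test_ke yields /old/var/adm/ras/liveupdate/kext/test_ke.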
If a kernel extension expects that the DLPAR handling operation takes a long time, the handler must return DR_WAIT to the caller, and proceed with the request asynchronously. When the request is completed, the handler must call the reconfig_complete() kernel service.
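The deferred-completion pattern can be modeled in user space as follows. This is only a sketch of the control flow: SKETCH_DR_WAIT, sketch_reconfig_complete(), and the request structure are stand-ins invented for this example, because the real DR_WAIT value and the reconfig_complete() signature come from AIX kernel headers that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the real AIX return codes (assumed names and values). */
enum sketch_dr_rc { SKETCH_DR_SUCCESS = 0, SKETCH_DR_WAIT = 1 };

/* Hypothetical bookkeeping for one DLPAR request. */
struct sketch_request {
    int  phase;      /* hypothetical phase identifier */
    bool pending;    /* true while asynchronous work is outstanding */
    bool completed;  /* set when the request is finished */
};

/* Stand-in for reconfig_complete(): records that the request finished. */
static void sketch_reconfig_complete(struct sketch_request *req)
{
    req->pending = false;
    req->completed = true;
}

/* Handler: short work finishes inline, long work is deferred. */
enum sketch_dr_rc sketch_handler(struct sketch_request *req, bool long_running)
{
    assert(req != NULL);
    if (!long_running) {
        req->completed = true;
        return SKETCH_DR_SUCCESS;
    }
    req->pending = true;      /* hand the request off to a worker */
    return SKETCH_DR_WAIT;    /* the caller must not block on this handler */
}

/* Worker: runs later, outside the handler, then reports completion. */
void sketch_worker(struct sketch_request *req)
{
    assert(req != NULL);
    if (req->pending) {
        /* ... the slow checkpoint or reconfiguration work goes here ... */
        sketch_reconfig_complete(req);
    }
}
```

The point of the split is that the handler returns quickly in both cases; only the worker performs the slow part and signals completion.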
Application state that is held in kernel extensions must be handled by those kernel extensions. They need to checkpoint the application state when the applications are checkpointed, and reload it with the correct state when the applications are restarted.
Device considerations
When the surrogate partition is started, the devices must be configured similarly to the configuration on the original partition. The same device on the original partition and the surrogate partition must have the same name, the same device number (devno (major, minor)), and the same device configuration.
Some devices might have customized attributes that are modified in the Object Data Manager (ODM) but have not yet taken effect, because such changes take effect only when the LPAR is rebooted. When the surrogate partition is booted, the customized attributes take effect. The storage devices might not have the same multipathing topology on the surrogate partition as on the original partition.
Kernel extensions in mobility
Kernel extensions need special consideration for mobility so that the workload is not interrupted. For most kernel extensions, unloading them on the original partition and reloading them on the surrogate partition suffices.
Safe kernel extensions
By default, all kernel extensions that are loaded on the original partition must be identified as safe for the Live Update operation, unless you override this requirement with the kext_check setting in the /var/adm/ras/liveupdate/lvupdate.data file.
Generally, a kernel extension is safe for the Live Update operation if the kernel extension is aware of the Live Update operation or does not need to be aware of the Live Update operation. A kernel extension is deemed to be Live Update safe if it meets one of the following requirements:
- The kernel extension is loaded with the SYS_LUSAFE flag.
- The kernel extension name is in the /etc/liveupdate/lvup_SafeKE file.
To mark a kernel extension as Live Update safe, load it by using the sysconfig() call with the SYS_LUSAFE flag, which is defined in the sys/sysconfig.h file.
In some safe kernel extensions, the SYS_LUSAFE flag might not be set. You can mark them as safe for a Live Update operation by using the lvupdateSafeKE command.
Safe kernel extensions are listed in the /etc/liveupdate/lvup_safeKE file. Duplication is not allowed in this list. Each kernel extension must be listed with its full path.
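The two rules for the list (full paths only, no duplicates) can be checked with a short routine. This is a minimal sketch that operates on entries already read into memory; the function name check_safe_list() is an assumption, and the sketch does not describe how the lvupdateSafeKE command itself validates the file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical validator for an in-memory copy of the safe list.
 * Returns 0 if every entry is a full path and no entry repeats;
 * otherwise returns the 1-based index of the first offending entry. */
size_t check_safe_list(const char *const *entries, size_t count)
{
    assert(entries != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        /* Each kernel extension must be listed with its full path. */
        if (entries[i] == NULL || entries[i][0] != '/')
            return i + 1;
        /* Duplication is not allowed: compare with all earlier entries. */
        for (size_t j = 0; j < i; j++)
            if (strcmp(entries[i], entries[j]) == 0)
                return i + 1;
    }
    return 0;
}
```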
In all modes, the Live Update operation validates that the loaded kernel extensions are safe, even when you choose not to enforce the requirement. When the requirement is not enforced, the Live Update operation logs the noncompliant kernel extensions but continues to operate.
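The validation logic described in this section can be summarized in a small model. It is a sketch under stated assumptions: the structure, the function names, and the boolean enforce parameter (which stands for whatever the kext_check setting selects) are hypothetical and are not taken from the Live Update implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one loaded kernel extension. */
struct sketch_kext {
    const char *path;         /* full path of the kernel extension */
    bool        lusafe_flag;  /* true if it was loaded with SYS_LUSAFE */
};

/* Safe if loaded with the flag, or if its path is in the safe list. */
static bool sketch_is_safe(const struct sketch_kext *ke,
                           const char *const *safe_list, size_t nsafe)
{
    if (ke->lusafe_flag)
        return true;
    for (size_t i = 0; i < nsafe; i++)
        if (strcmp(ke->path, safe_list[i]) == 0)
            return true;
    return false;
}

/* Always validates and logs; only stops the operation when enforcing.
 * Returns the number of noncompliant kernel extensions. */
size_t sketch_validate(const struct sketch_kext *loaded, size_t nloaded,
                       const char *const *safe_list, size_t nsafe,
                       bool enforce, bool *proceed)
{
    assert(proceed != NULL);
    size_t unsafe = 0;

    for (size_t i = 0; i < nloaded; i++) {
        if (!sketch_is_safe(&loaded[i], safe_list, nsafe)) {
            fprintf(stderr, "not Live Update safe: %s\n", loaded[i].path);
            unsafe++;
        }
    }
    *proceed = (unsafe == 0) || !enforce;
    return unsafe;
}
```

With enforcement on, one unlisted and unflagged kernel extension is enough to clear proceed; with enforcement off, the same extension is still counted and logged but proceed stays true.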
Loading kernel extensions
When the surrogate partition is started, it loads only those kernel extensions that are related to configured devices. Commands that usually run during the regular initialization of an LPAR might not run. As a result, some kernel extensions that are needed by checkpointed applications might not be loaded when the applications are restarted. The Live Update framework offers more than one mechanism to handle this situation:
- Applications with kernel extensions can be enabled for checkpoint if they manage the loading and unloading of the kernel extensions. The unloading must occur before the applications are frozen, and the kernel extensions can be loaded again when the applications are restarted.
- Kernel extensions can be preloaded on the surrogate partition before the applications are restarted. The Live Update framework offers a registration mechanism. All loading methods that are registered for the Live Update operation are executed before the applications are restarted. The lvupdateRegKE command can be used to add or remove kernel extensions to be preloaded.
- The full path of the kernel extension is required. If a loading error occurs, the Live Update operation is stopped.
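The preload behavior in the last two items can be modeled as a loop that runs each registered entry in order and stops at the first failure. This is an illustrative sketch only: sketch_load() is a stand-in that merely checks for a full path, and it does not represent what the lvupdateRegKE command or the Live Update framework actually executes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in loader: a real one would run the registered loading method.
 * Here the load fails unless the entry is a full path. */
static int sketch_load(const char *path)
{
    return (path != NULL && path[0] == '/') ? 0 : -1;
}

/* Runs every registered entry before applications are restarted.
 * Returns 0 if all loads succeed, or -1 at the first loading error,
 * in which case the operation is considered stopped.
 * *loaded receives the number of entries that were loaded. */
int sketch_run_preloads(const char *const *paths, size_t count, size_t *loaded)
{
    assert(loaded != NULL);
    *loaded = 0;

    for (size_t i = 0; i < count; i++) {
        if (sketch_load(paths[i]) != 0) {
            fprintf(stderr, "preload failed for entry %zu\n", i);
            return -1;  /* stop: later entries are not attempted */
        }
        (*loaded)++;
    }
    return 0;
}
```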
Example for interaction between a process and a kernel extension
This example shows how the interaction between a process and a kernel extension must be handled. The goal of the Live Update operation is to preserve the behavior of workloads in the update process.
Suppose that an application comprises a test_process process and a test_ke kernel extension. The test_ke kernel extension has a variable, counter, that is used to count some events. The test_process process reads counter from test_ke and consumes it during its execution. When test_ke is loaded, the counter is initialized to 0. The counter's value increases with time. In the Live Update operation, when test_process is checkpointed, its process state is saved, but the counter value is not saved. Because kernel extensions are not checkpointed, you must ensure that the counter is preserved when test_ke is loaded on the surrogate partition. This function is supported by the DLPAR framework in the Live Update operation.
- Applications are checkpointed on the original partition.
- A notification is sent to the kernel extensions at the pre phase.
- The test_ke kernel extension uses the reconfig_register_list() kernel service to register reconfiguration handlers for DLPAR events.
- In the handler for the pre phase, the counter is saved in the /var/adm/ras/liveupdate/kext/test_ke file. This file is located on rootvg so that it can be transferred to the surrogate partition after the partition is mirrored.
- On the surrogate partition, the pre phase notification is sent to kernel extensions after the surr-mir-rootvg environment is mounted. This means that the saved data for the test_ke kernel extension, including the counter variable, is now available. The state of the test_ke kernel extension can be reconfigured to match the state when it was saved.
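The save and restore steps for the counter can be sketched as a pair of functions. This is a user-space analogue written with standard C file I/O, not kernel extension code: a real handler would be registered through reconfig_register_list() and would use kernel file services, and the function names and the text file format below are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Pre phase on the original partition: persist the counter.
 * Returns 0 on success, -1 on any I/O error. */
int sketch_save_counter(const char *file, unsigned long counter)
{
    assert(file != NULL);
    FILE *fp = fopen(file, "w");
    if (fp == NULL)
        return -1;

    int ok = fprintf(fp, "%lu\n", counter) > 0;
    if (fclose(fp) != 0)   /* a failed close can mean the data was lost */
        ok = 0;
    return ok ? 0 : -1;
}

/* Pre phase on the surrogate partition: reload the saved counter.
 * Returns 0 on success, -1 if the file is missing or malformed. */
int sketch_restore_counter(const char *file, unsigned long *counter)
{
    assert(file != NULL && counter != NULL);
    FILE *fp = fopen(file, "r");
    if (fp == NULL)
        return -1;

    int ok = fscanf(fp, "%lu", counter) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

In the scenario above, the save side would write to /var/adm/ras/liveupdate/kext/test_ke, and the restore side would read the mirrored copy from wherever it is visible on the surrogate partition.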