Data management events

Data management events arrive on a session queue from any of the nodes in the GPFS cluster.

The source node of an event is identified by the ev_nodeid field in the header of each event message (the dm_eventmsg structure). The identifier is the GPFS cluster data node number, which is the node_number attribute in the mmsdrfs2 file for a PSSP node or in the mmsdrfs file for any other type of node.

Data Management events are generated only if the following two conditions are true:
  1. The event is enabled.
  2. The event has a disposition.

A file operation will fail with the EIO error if there is no disposition for an event that is enabled and would otherwise be generated.
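
The rule above can be sketched as a small decision function. This is only a model of the documented behavior, not GPFS code: the op_outcome_t type and the classify_operation function are hypothetical names introduced for illustration and are not part of DMAPI.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not part of DMAPI. */
typedef enum {
    OP_PROCEEDS_WITHOUT_EVENT, /* event not enabled: nothing is generated */
    OP_GENERATES_EVENT,        /* event enabled and a disposition exists  */
    OP_FAILS                   /* event enabled but no disposition: EIO   */
} op_outcome_t;

/* Classify a file operation that would trigger a data management event.
 * On return, *err is EIO for the failure case and 0 otherwise. */
static op_outcome_t classify_operation(bool enabled, bool has_disposition,
                                       int *err)
{
    *err = 0;
    if (!enabled)
        return OP_PROCEEDS_WITHOUT_EVENT;
    if (!has_disposition) {
        *err = EIO;
        return OP_FAILS;
    }
    return OP_GENERATES_EVENT;
}
```

In this model, a disposition without an enabled event has no effect, while an enabled event without a disposition is the only combination that makes the file operation fail.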

A list of enabled events can be associated individually with a file and globally with an entire file system. The XDSM standard leaves undefined the situation where the individual and the global event lists are in conflict. In GPFS, such conflicts are resolved by always using the individual event list, if it exists.
Note: The XDSM standard does not provide the means to remove the individual event list of a file. Thus, there is no way to enable or disable an event for an entire file system without explicitly changing each conflicting individual event list.
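
The conflict resolution described above can be illustrated with a minimal sketch. It assumes a simplified bitmask representation of event lists; the EV_* constants, the file_events_t type, and the effective_events function are hypothetical and stand in for the dm_eventset_t type that real DMAPI code would use. The sketch also assumes that an individual list that has been emptied still counts as existing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits for this sketch only. */
enum { EV_READ = 1u << 0, EV_WRITE = 1u << 1, EV_TRUNCATE = 1u << 2 };

typedef struct {
    bool     has_individual_list; /* true once a per-file list was set    */
    uint32_t individual_events;   /* may be empty (0) and still "exist"   */
} file_events_t;

/* Return the event list that applies to a file: the individual list
 * if one exists, otherwise the global list of the file system. */
static uint32_t effective_events(const file_events_t *file,
                                 uint32_t fs_global_events)
{
    return file->has_individual_list ? file->individual_events
                                     : fs_global_events;
}
```

Under these assumptions, enabling an event in the global list has no effect on a file whose individual list omits that event, which is why each conflicting individual list must be changed explicitly.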

In GPFS, event lists are persistent.

Event dispositions are specified per file system and are not persistent. They must be set explicitly after the session is created.

Event generation mechanisms have limited capacity. If their resources are exhausted, new file operations wait indefinitely until resources become free.

File operations wait indefinitely for a response to synchronous events. The dmapiEventTimeout configuration attribute of the mmchconfig command can be used to set a timeout on events that originate from NFS file operations. This is necessary because NFS servers have a limited number of threads, and these threads cannot be blocked for long periods of time. Refer to GPFS configuration attributes for DMAPI and Support for NFS.

The XDSM standard permits asynchronous events to be discarded at any time. In GPFS, delivery of asynchronous events is guaranteed when the system runs normally, but events may be lost during abnormal conditions, such as failure of GPFS on the session node. Asynchronous events are delivered in a timely manner: an asynchronous event is enqueued to the session before the corresponding file operation completes.

Figure 1 shows the flow of a typical synchronous event in a multiple-node GPFS environment. The numbered arrows in the figure correspond to the following steps:
  1. The user application on the source node performs a file operation on a GPFS file. The file operation thread generates a synchronous event and blocks, waiting for a response.
  2. GPFS on the source node sends the event to GPFS on the session node, according to the disposition for that event. The event is enqueued to the session queue on the session node.
  3. The Data Management application on the session node receives the event (using dm_get_events) and handles it.
  4. The Data Management application on the session node responds to the event (using dm_respond_event).
  5. GPFS on the session node sends the response to GPFS on the source node.
  6. GPFS on the source node passes the response to the file operation thread and unblocks it. The file operation continues.
Figure 1. Flow of a typical synchronous event in a multiple-node GPFS environment
This graphic depicts the typical flow of a synchronous event in a three-node GPFS cluster. The first node is the session node, with GPFS and the Data Management API installed. The second node is the source node, with GPFS and the user application installed. The third node has only GPFS installed and is therefore not defined to the Data Management API. The numbered arrows show the flow of communication for the event, as described in steps 1 through 6.
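
The six steps of the synchronous event flow can be modeled in a single process as a rough sketch. The code below does not call DMAPI; all type and function names in it (sync_event_t, session_queue_t, generate_event, receive_event, respond_event) are hypothetical. In a real Data Management application, step 3 corresponds to dm_get_events and step 4 to dm_respond_event. The fixed queue capacity and the false return value on a full queue are simplifications, because in GPFS the file operation would wait for free resources instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* State of the file operation thread on the source node. */
typedef enum { OP_RUNNING, OP_BLOCKED, OP_RESUMED } op_state_t;

typedef struct {
    int        source_node; /* stands in for the ev_nodeid header field */
    bool       responded;
    op_state_t op_state;
} sync_event_t;

#define QUEUE_CAP 8
typedef struct {
    sync_event_t *slots[QUEUE_CAP];
    size_t        head, count;
} session_queue_t;

/* Steps 1-2: the file operation generates the event and blocks; the
 * event is enqueued to the session queue on the session node. */
static bool generate_event(session_queue_t *q, sync_event_t *ev, int node)
{
    if (q->count == QUEUE_CAP)
        return false; /* simplification: GPFS would wait instead */
    ev->source_node = node;
    ev->responded = false;
    ev->op_state = OP_BLOCKED;
    q->slots[(q->head + q->count) % QUEUE_CAP] = ev;
    q->count++;
    return true;
}

/* Step 3: the Data Management application receives the next event. */
static sync_event_t *receive_event(session_queue_t *q)
{
    if (q->count == 0)
        return NULL;
    sync_event_t *ev = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return ev;
}

/* Steps 4-6: the application responds, the response reaches the source
 * node, and the file operation thread is unblocked. */
static void respond_event(sync_event_t *ev)
{
    ev->responded = true;
    ev->op_state = OP_RESUMED;
}
```

A caller would invoke generate_event for the file operation, then receive_event and respond_event on behalf of the Data Management application; the file operation stays in the OP_BLOCKED state until respond_event runs.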