clone (BPX1CLN, BPX4CLN) — Create a child process

Function

The clone service creates a new process, which is called a child process. This service is similar to the fork callable service but has more options, such as allowing the new child process to run in a newly created namespace.

For the macro, see BPXYCLNP— Map clone syscall parameters.

Requirements

Operation                 Environment
Authorization             Supervisor state or problem state, PSW key 8, TCB key 8
Dispatchable unit mode    Task
Cross memory mode         PASN = HASN
AMODE (BPX1CLN)           31-bit
AMODE (BPX4CLN)           64-bit
ASC mode                  Primary mode
Interrupt status          Enabled for interrupts
Locks                     Unlocked
Control parameters        All parameters must be addressable by the caller and in the primary address space.

Format

CALL BPX1CLN,(Clnp_length,
              Clnp,
              Process_ID,
              Return_code,
              Reason_code)

AMODE 64 callers use BPX4CLN with the same parameters. All parameter addresses and addresses in parameter structures are doublewords.

Parameters

Clnp_length
Supplied parameter.
Type
Integer
Length
Fullword

The name of a fullword field that contains the length of the Clnp control block that is being passed in the next parameter. To determine the value of Clnp_length, use the BPXYCLNP macro. (See BPXYCLNP— Map clone syscall parameters.)

Clnp
Supplied parameter.
Type
Structure.
Length
Specified by the Clnp_length parameter.

The name of a CLNP structure that controls the extent of sharing between the calling process and the child process. For details on setting the fields of the CLNP, see the usage notes. The BPXYCLNP macro maps the CLNP. (See BPXYCLNP — Map clone syscall parameters.)

The following table shows the flags that can be specified.

Flag Description
0 Creates a process, the same as fork.
CLONE_NEWIPC Creates the process in a new IPC namespace.
CLONE_NEWPID Creates the process in a new PID namespace. CLONE_NEWPID cannot be used with CLONE_PARENT.
CLONE_PARENT Creates the process such that the parent of the new process is the same as the parent of the calling process. When the child is terminated, the parent of the calling process is signaled. CLONE_PARENT cannot be used with CLONE_NEWPID.
Restriction: The clone syscall accepts only a signal of SIGCHLD. Any other value results in an EINVAL errno.
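
The flag and signal rules above can be sketched as a small pre-check. This is only an illustration in Python, not the service's own logic: the bit values in CloneFlag are placeholders (the real values come from the BPXYCLNP mapping macro), check_clone_request is a hypothetical helper, and the pairing of each condition with a reason code name is an assumption based on the names alone.

```python
import signal
from enum import IntFlag


class CloneFlag(IntFlag):
    # Placeholder bit values for illustration only; the real values
    # are defined by the BPXYCLNP mapping macro.
    CLONE_NEWIPC = 0x1
    CLONE_NEWPID = 0x2
    CLONE_PARENT = 0x4


# Every flag bit this sketch knows about.
KNOWN_FLAGS = int(
    CloneFlag.CLONE_NEWIPC | CloneFlag.CLONE_NEWPID | CloneFlag.CLONE_PARENT
)


def check_clone_request(flags, sig):
    """Return None if the request looks valid, else (errno, reason).

    The reason code chosen for each condition is an assumption.
    """
    flags = int(flags)
    if sig != signal.SIGCHLD:
        # Only SIGCHLD is accepted as the termination signal.
        return ("EINVAL", "JRUnsupportedSignal")
    if flags & ~KNOWN_FLAGS:
        # A bit outside the documented flags was set.
        return ("EINVAL", "JRUnsupportedFlag")
    if flags & CloneFlag.CLONE_NEWPID and flags & CloneFlag.CLONE_PARENT:
        # CLONE_NEWPID and CLONE_PARENT cannot be combined.
        return ("EINVAL", "JRMutuallyExclFlag")
    return None
```

A flags value of 0 with SIGCHLD passes the check, which corresponds to the fork-like case in the table.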
Process_ID
Returned parameter.
Type
Integer.
Length
Fullword.

The name of a fullword in which the clone service places the process ID of the newly created child process, 0, or -1.

Upon successful completion, clone returns the process ID of the newly created child to the calling process.

Because the child is a duplicate, it contains the same service request to the clone service as the calling process. Execution of the child begins with this clone service returning a process ID value of zero. The child then proceeds with normal execution.

If Process_ID is returned as -1, no child process was created for the reason shown by Return_code.
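
The three possible Process_ID outcomes follow the familiar fork pattern. The sketch below uses POSIX fork through Python's os module as a stand-in, because clone with no flags is described as behaving the same as fork; it does not call BPX1CLN, and clone_like_fork is a hypothetical helper name.

```python
import os


def clone_like_fork(child_exit_code):
    """Fork, let the child exit, and return what the caller observes."""
    pid = os.fork()
    if pid == 0:
        # Child: the service returned 0. Exit at once without running
        # the caller's cleanup handlers.
        os._exit(child_exit_code)
    # Caller: pid is the child's process ID. On failure os.fork raises
    # OSError, where the callable service would return -1 instead.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

The caller sees a positive process ID and can collect the child's exit status, while the child takes the branch where the returned value is zero.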

Return_code
Returned parameter.
Type
Integer
Length
Fullword
The name of a fullword in which the clone service stores the return code. The clone service returns Return_code only if Process_ID is -1. For a list of return code values, see Return codes (errnos) in z/OS UNIX System Services Messages and Codes. A list of possible return codes follows.
Return_code Explanation
EAGAIN The resources required to allow another process to be created are not available now, or you have already reached the maximum number of processes you can run.

The following reason codes can accompany the return code: JRForkExitRcChildNoStorage, JRForkExitRcParentBadEnv, JRForkExitRcParentNoRoom, JRForkNoAccess, JRForkNoResource, JRForkVsmListTooLarge, JRKernelReady, JRMaxChild, JRMaxProc, JRMaxUIDs, JRNoSecurityProduct, JRNotKey8, and JRWlmWonErr.

EINVAL
One of the input parameters was not valid.
  • The identifier, version, and length values, signal, or clone_flags provided in the CLNP are incorrect.
  • A CLONE_PARENT request was made from the init process or a namespace init process.
The following reason codes can accompany the return code: JRJsrRacXtr, JRCLNPNotValid, JRUnsupportedFlag, JRUnsupportedSignal, JRMutuallyExclFlag, JrCalledFromInitProc.
ENOMEM The process requires more space than is available.

The following reason codes can accompany the return code: JrNSInitProcTerm, JrNamespaceNotFound.

ENOSPC A system limit was reached.
  • The limit on the number of namespaces would be exceeded.
  • Creating a PID namespace as requested by CLONE_NEWPID would cause the nesting depth limit of PID namespaces to be exceeded.
The following reason codes can accompany the return code: JRMaxNamespace, JrMaxNamespaceNestin.
EPERM The calling process does not have appropriate privileges.
  • The user is not a superuser and is not permitted to the CONTAINERS resource in the UNIXPRIV class.
The following reason code can accompany the return code: JrNotAuthNameSp.
EMVSSAF2ERR An error occurred in the security product.

The following reason code can accompany the return code: JrSAFInternal.

Reason_code
Returned parameter.
Type
Integer.
Length
Fullword.

The name of a fullword in which the clone service stores the reason code. The clone service returns Reason_code only if Process_ID is -1. Reason_code further qualifies the Return_code. For a list of reason codes, see Reason codes in z/OS UNIX System Services Messages and Codes.

Usage notes for clone

  1. The input CLNP block must be set up correctly or the service will return with EINVAL. The identifier, version, and length values must all be set. The signal must be SIGCHLD.
  2. If a UNIX set-user-ID or set-group-ID privileged program that switched the caller's effective UID or GID invokes the clone service, the child process that is created inherits the privilege of the set-user-ID or set-group-ID program.
  3. The new process (the child process) is a duplicate of the process that calls the clone service (the calling process), with the following exceptions:
    • The child process has a process ID (PID) that is unique in its namespace and in each ancestor namespace. The PID does not match any active process group ID.
    • The child is created in the same namespaces, unless one or more of the CLONE_NEWxxx fields in the CLNP are set or a prior UNSHARE CLONE_NEWPID or SETNS CLONE_NEWPID was issued by the calling process.
    • The child has a different parent process ID (namely, the process ID of the process that called the clone service) unless CLONE_PARENT was specified. If the new process is created in a PID namespace other than the PID namespace of the caller, the child appears to have no parent process (PPID=0) from its view within the namespace.
    • The child has its own copy of the calling process's file descriptors. Each file descriptor in the child refers to the same open file as the corresponding file descriptor in the calling process.
    • If the file has its FCTLCLOFORK flag set on, it is not inherited by the child process. This flag is set with the fcntl service. For more information, see fcntl (BPX1FCT, BPX4FCT) — Control open file descriptors.
    • The child has its own copy of the calling process's open directory streams. Each open directory stream in the child can share directory stream positioning with the corresponding directory stream of the calling process.
    • The process and system utilization times for the child are set to zero.
    • Any file locks that were previously set by the calling process are not inherited by the child.
    • The child process has no interval timers set (similar to the results of a call to the alarm service with Wait_time specified as zero).
    • The child has no pending signals.

    In other respects, for z/OS UNIX the child is identical to the calling process.

  4. When a clone request creates the process in one or more new namespaces (that is, one of the CLONE_NEWxxx flags is specified), the caller must be a superuser or have at least READ access to the CONTAINERS resource in the UNIXPRIV class.
  5. The child process inherits all key 8 shared memory segments that are attached to the calling process. The internal values of the number of processes that are attached to each shared memory segment (shm_nattch) are incremented.

    Because BPX1CLN only supports the propagation of key 8 storage, the clone service does not propagate to the child any shared memory segments that reside in a storage key other than key 8.

  6. If the calling address space uses the macro IARVSERV to capture storage, the pages are not copied to the child address space.
  7. The semaphore adjustment values (semadj) are cleared in the child process.
  8. PSW Key 2 mmap storage areas are not propagated to the child. Above the bar key 2 and key 8 mmap storage areas are propagated to the child.
  9. For AMODE 64 callers, high-memory storage is copied to the child process in the following cases:
    • All storage that is obtained by an IARV64 request that was made by the cloning thread is copied to the child process.
    • All storage that is obtained by an IARV64 request with a user token that contains zeros in bits 0-31 and the calling process's PID in bits 32-63 is copied to the child process. In the child process, the user token is changed to the value of the child process's PID in bits 32-63.
    • All storage that is obtained by an IARV64 request with a user token that contains zeros in bits 0-31 and a nonzero value that matches ThliParentTkn in bits 32-63 (when ThliChildTkn is nonzero) is copied to the child process. In the child process, the user token is changed to the value of ThliChildTkn from the calling process. This value is also used to initialize ThliParentTkn on the child process.
    • All authorized storage that is obtained by an IARV64 request with a user token that contains zeros in bits 32-63 and the calling process's PID in bits 0-31 is copied to the child process. In the child process, the user token is changed to the value of the child process's PID in bits 0-31.
    • All authorized storage that is obtained by an IARV64 request with a user token that contains zeros in bits 32-63 and the value of PSALAA in bits 0-31 is copied to the child process. In the child process, the user token is changed to the value of the child process's LAA in bits 0-31.
  10. The child process inherits the MEMLIMIT of the calling process.

The child address space inherits the following address space attributes of the calling process address space: region size and time limit.
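
The PID-based user token layouts in usage note 9 can be illustrated with plain integer arithmetic. In z/Architecture bit numbering, bit 0 is the most significant bit of the doubleword, so bits 0-31 are the high-order word and the remaining bits are the low-order word. The function names below are hypothetical, the PID is assumed to be nonzero, and the ThliParentTkn and LAA cases are not modeled.

```python
MASK32 = 0xFFFFFFFF


def unauthorized_token(pid):
    """Zeros in bits 0-31, PID in the low-order word."""
    return pid & MASK32


def authorized_token(pid):
    """PID in bits 0-31 (the high-order word), zeros in the low-order word."""
    return (pid & MASK32) << 32


def retarget_token(token, parent_pid, child_pid):
    """Return the token the child would see, or None if not PID-tagged.

    Models only the two PID-based cases; assumes a nonzero parent PID.
    """
    if token == unauthorized_token(parent_pid):
        return unauthorized_token(child_pid)
    if token == authorized_token(parent_pid):
        return authorized_token(child_pid)
    return None
```

For example, a caller with PID 0x1234 would tag unauthorized storage with the token 0x0000000000001234, and the child would see the same storage tagged with its own PID in the same word.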

Related services

Characteristics and restrictions

Following is a list of characteristics or restrictions for the clone service:
  • The clone service can be requested from either an MVS or kernel address space.
  • The clone service is supported from programs that are running in PSW key 8 only. An additional requirement is that the storage protection key value in the TCBPKF field of the task control block (TCB) must be 8. The clone service from authorized or problem-state programs with a PSW key other than 8 or a TCBPKF value other than 8 is rejected with an error code.
  • A namespace init process is the first process to run within a PID namespace and is always assigned PID 1 within the namespace. A namespace init process is created as a result of a clone service request with CLONE_NEWPID specified or the first child process created after an unshare service with CLONE_NEWPID specified. When a namespace init process terminates, all processes in the PID namespace are signaled to terminate and no new processes can be created in the namespace.

    Once the first child process is created after an unshare for a PID namespace, subsequent child processes created by the calling process are created in the new namespace. Similarly, a prior setns service for a PID namespace will cause subsequent children of the calling process to be created in a specific namespace. If child processes are to be created in a specific PID namespace whose init process has terminated, the clone service will fail with ENOMEM.

  • The CLONE_PARENT flag cannot be specified when invoked from a namespace init process. Doing so results in an EINVAL error.
  • Only the following storage subpools are copied by clone: 0-127, 129-132, and 251-252.
  • Except for subpool 252, which is all key-0 storage, only the caller's key-8 storage is copied to the child. For subpools that support multiple keys (that is, subpool 129 to subpool 132) only storage that is obtained with a key of 8 is copied.
  • When the clone service is called from a single-process address space, all storage that was obtained by all the tasks in the calling job step in the given subpools is copied to the child address space.

    When the clone service is called from a multiple-process address space, only storage that is obtained by the tasks in the calling process in the previously identified subpools is copied to the child address space.

  • The child process always runs in problem state with a PSW key of 8, even when it is cloned by an APF-authorized MVS process.
  • One task (thread) and one request block (RB) are present in the child address space after the clone service request. If the calling process was single-task with multiple RBs, only a single RB is created in the child address space after the clone service request. If multiple tasks exist in the calling process, only the task issuing the clone service request is replicated. Serialization does not occur among the different tasks.
  • The TCB address and the addresses of other MVS control blocks are likely to be different in the child.
  • The clone service does not copy any system subpools or MVS control blocks from the calling process to the child, except as noted.

    For example, the task I/O table (TIOT) is not copied. MVS data sets that were allocated in the calling process are not allocated to the child except for the propagated TASKLIB, STEPLIB, or JOBLIB DD data sets. Because user data in user subpools is copied, some of those control blocks might point to system control blocks that are no longer present in the child.

    As another example, a user's data control block (DCB) that was opened in the calling process still appears as an opened DCB in the child. However, the corresponding system control blocks pointed to by the DCB are not present in the child.

    Only services that are documented as supported can be used across the clone service.

  • There is a limit on the total number of living or zombied children the calling process can have at a time. This limit is set with the MAXPROCUSER parameter in a BPXPRMxx parmlib member. You can retrieve this count with the sysconf service (BPX1SYC, BPX4SYC).
  • There is a limit on the maximum number of namespaces in the system (all types combined). It is set to one half of the maximum process limit. It is a static limit that is not affected by any changes made to MAXPROCSYS.
  • There is a limit on the number of processes allowed in a PID namespace (including processes in descendant namespaces, because they are visible). It is set to one half of the MAXPROCSYS value that was in effect when the namespace was created.
  • PID namespaces can be nested, thus forming a hierarchical tree. The nesting depth of PID namespaces is limited to 4 namespace levels underneath the root namespace.
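
The namespace limits in the list above can be sketched as a simple admission check. This is an illustration under assumptions, not the kernel's algorithm: the root namespace is treated as depth 0, odd limits are rounded down, the order of the two checks is arbitrary, and check_new_pid_namespace is a hypothetical helper. The reason code names are the ones listed under ENOSPC.

```python
MAX_PID_NS_DEPTH = 4  # levels allowed underneath the root namespace


def namespace_limit(max_processes):
    """Namespace limit: one half of the maximum process limit.

    Rounding down for odd values is an assumption.
    """
    return max_processes // 2


def check_new_pid_namespace(caller_depth, namespaces_in_use, max_processes):
    """Return None if a new PID namespace fits, else (errno, reason).

    caller_depth is 0 for a caller in the root namespace.
    """
    if namespaces_in_use + 1 > namespace_limit(max_processes):
        return ("ENOSPC", "JRMaxNamespace")
    if caller_depth + 1 > MAX_PID_NS_DEPTH:
        return ("ENOSPC", "JrMaxNamespaceNestin")
    return None
```

With a maximum process limit of 200, the sketch allows up to 100 namespaces, and a caller that is already four levels below the root cannot create a further nested PID namespace.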
Although the child process resembles the calling process in many ways, it has specific differences from the calling process. In addition to the differences described in POSIX.1 (under fork), the following are some examples of elements in the calling process that are not propagated to the child process:
Linkage stack
The caller can have a linkage stack, but the child does not inherit it. If the caller intends to do an exec service request in the child, the loss of the linkage stack is not a problem. It is a problem only if the child process executes a PR (Program Return) instruction that requires the linkage stack.
Access list (that is, PASN-AL, DU-AL)
The calling process's access lists are not propagated to the child.
Access registers
Access registers are not propagated to the child because the child process does not inherit the calling process's access list, which would be needed to use the access registers.
Virtual pages
Virtual pages that were page-fixed in the calling process are not page-fixed in the child.
Dynamic resource managers (RESMGRs)
Dynamic resource managers that were established for the calling process are not propagated to the child.
MVS files
Any MVS files that were opened for the calling process are not opened for the child process, except for the TASKLIB, STEPLIB, or JOBLIB DD data sets that were propagated from the calling process. Only z/OS UNIX files are opened in the child process.
Mutexes and condition variables
Because ownership of mutexes and condition variables is on a single-thread basis, these attributes cannot be propagated on clone. Where a mutex or condition variable exists, the thread that is created in the child has access to the shared memory and can use the mutex or condition variable. However, when it begins running, it will not own any mutexes or consume any condition variables.

Examples

For examples that use the clone callable service, see BPX1CLN (clone) example and BPX4CLN (clone) example.

MVS-related information

  1. Following is a list of services in the child that relate to the services done in the process that is being cloned.
    GETMAIN, FREEMAIN, or STORAGE
    If the calling process has issued a GETMAIN macro for a storage block, the child process can issue a FREEMAIN macro for the same storage block.
    LOAD or DELETE
    If a calling process in problem state issues a LOAD macro for a module, the child process can issue a DELETE macro to remove the module from storage. If the child process issues a LOAD macro for the same module that was loaded in the calling process, the copied version of the module is used and the use count is incremented. If a calling process in supervisor state issues a LOAD macro for a module, the child process cannot issue a DELETE macro for the module. It also cannot use a LOAD macro to load a new copy of the module. However, a LOAD macro for global storage is not reflected in the child; the child cannot issue a DELETE macro to remove a module that was loaded into common storage by the calling process.
    CSVQUERY
    The EPTOKEN (entry point token) returned as OUTEPTKN on a CSVQUERY macro in the calling process can be used by the child as the INEPTKN parameter on a CSVQUERY macro to refer to the same module.
    ESTAE
    The child process can issue an ESTAE macro with a 0 parameter to delete an ESTAE routine that was established by the calling process.
    ESPIE
    The child process can delete an ESPIE routine that was established by the calling process.

    No other MVS services are carried across clone. They can be freely used in either the calling process or the child process. However, the results of these services (if performed in the calling process) are not available to the child process.
  2. The system propagates the contents directory-related information (including extent lists) for the job pack queue for the job step task that is related to the task issuing the clone call. It also propagates the information on all modules (whether private or in the LPA) that have been loaded by the task issuing the clone call.
  3. The system propagates the current task's SPIE or ESPIE and STAE or ESTAE status to the child process.
    • STAE or ESTAE control blocks that represent the current RB are propagated to the child process. Control blocks that are associated with older RBs are not propagated, nor are STAI or ESTAI control blocks.
    • SPIE or ESPIE control blocks that represent the current RB are propagated to the child process. SPIE or ESPIE control blocks that are associated with older RBs are not propagated.
  4. Security information from the calling process's address space is propagated to the child's address space. As a result, the child has a security environment equivalent to that of the calling process.
  5. The TASKLIB, STEPLIB, or JOBLIB DD data set allocations that are active for the current task are propagated to the child's address space. This causes the child address space to have the same MVS program search order as the calling process task.
  6. The accounting information of the calling process's address space is propagated to the child's address space. (See Managing accounting work in z/OS® UNIX System Services Planning.)

    If the ThliForkAcctg bit is set on in BPXYTHLI — Thread-level information, the clone service creates the child with the accounting data from the RACF® WORKATTR of the user ID that is associated with the last setuid call. If no setuid call has been performed, the accounting information from the calling process is used. No error is returned to the caller.

  7. The job name of the calling process is propagated to the child and appended with a numeric value in the range of 1-9 if the job name is 7 characters or fewer. If the job name is 8 characters, the job name is propagated as is. When a job name is appended with a numeric value, the count wraps back to 1 when it exceeds 9.
  8. If the calling process task is in a workload management (WLM) enclave, the child is joined to the same WLM enclave. This allows WLM to manage the calling process and child as one business unit of work entity for system accounting and management purposes.
  9. z/OS UNIX sets a default message class of A for all forked, cloned, or spawned processes. Unlike JES, z/OS UNIX does not have a method for accepting a user-supplied default message class, and a default had to be supplied to the converter interpreter. Message class A was chosen as the default for BPXAS initiators. You cannot dynamically change this default value. The MSGCLASS for the job log (JESMSGLG, JESJCL, JESYSMSG) is set to class A before the fork or spawn that associates the process with the BPXAS initiator is begun.
  10. The user syscall trace setting is propagated to the child process.
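
The job name rule in item 7 can be sketched as follows. The helper child_job_name and the idea of passing in the previously used suffix are assumptions made for illustration; the text does not say where the system keeps the counter.

```python
def child_job_name(parent_job_name, previous_suffix):
    """Return (child job name, suffix used) for a cloned child.

    previous_suffix is the last digit handed out (0 if none yet).
    """
    if len(parent_job_name) >= 8:
        # An 8-character job name is propagated unchanged.
        return parent_job_name, previous_suffix
    # Suffixes run 1..9 and wrap back to 1 after 9.
    suffix = previous_suffix % 9 + 1
    return f"{parent_job_name}{suffix}", suffix
```

Under these assumptions, a job named MYJOB would yield children MYJOB1 through MYJOB9 and then MYJOB1 again, while an 8-character name is passed through as is.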