Service profile reference
The service profile is an XML file that defines system service behavior. It provides the information to run services managed by the EGO service controller (EGOSC).
Environment variables
When modifying a service profile, you can configure the following environment variables with values for your system:
- EGO_TOP (on Windows, the top-level EGO directory is Installation_top)
- EGO_BINDIR
- EGO_CLIENT_ADDR
- EGO_CONFDIR
- EGO_ESRVDIR
- EGOSC_INSTANCE_CGROUP_PATH
- EGO_LIBDIR
- EGO_LOCAL_CONFDIR
- EGO_SERVERDIR
- EGO_MASTER_LIST
- EGO_MACHINE_TYPE
- EGO_SHARED_TOP
- EGO_ESRVIR
Description: For internal system use only.
Description: The number of active service instances for a service.
Required or optional: Optional
Default value: 1
ServiceAllocationInfo
Description: For internal system use only.
ServiceControlOperation
ServiceDefinition
Configures a service to be managed by the EGO service controller, defining version, resources, allocations, activities, and other configuration parameters for the service.
Description: Specifies the version of the schema.
Required or optional: Optional
Description: Describes this service.
Required or optional: Optional
Valid values: Specify 1 to 120 alphanumeric and special characters, except control characters.
Description: Specifies the minimum number of instances this service requires to run.
Service instances can be started even if the number of available slots does not satisfy MinInstances. A service with fewer than MinInstances instances started remains in the ALLOCATING state.
You can dynamically change this parameter in the service profile while a service is running.
Required or optional: Optional
Valid values: Specify a number greater than or equal to 1.
Default value: ""
Description: Specifies the maximum number of instances this service is allowed to run.
Specify a number greater than or equal to the value specified by MinInstances. You can dynamically change this parameter in the service profile while a service is running.
Required or optional: Optional
Valid values: Specify a number greater than or equal to 1.
Default value: ""
Description: Specifies the maximum number of instances of this service that is allowed per CPU slot. To fully utilize resources, it is recommended to configure MaxInstancesPerSlot so that it divides evenly into MaxInstancesPerHost.
This parameter is not valid for multidimensional scheduling.
Required or optional: Optional
Valid values: Specify a number greater than or equal to 1
Default value: 1
Description: Specifies the maximum number of instances of this service that is allowed per host. Specify a number greater than or equal to 1.
Limitation: No service instance can be started on a host if the ceiling specified by MaxInstancesPerHost or MaxInstancesPerSlot is more than the number of free slots on that host. Slots might be wasted if you configure MaxInstancesPerHost so that it does not divide evenly into MaxInstances.
Required or optional: Optional
Default value: Any number of instances can start on the host
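The relationships among the instance-count parameters described above can be checked before a profile is deployed. The following Python sketch is illustrative only: the function name and the wording of its messages are hypothetical, and it encodes just the constraints stated in this reference (values of at least 1, MaxInstances not below MinInstances, and the two "divides evenly" recommendations).

```python
def check_instance_limits(min_instances, max_instances, per_slot=1, per_host=None):
    """Return a list of problems found in the instance-count settings.

    Hypothetical helper: it is not part of EGO and only mirrors the rules
    stated in this reference.
    """
    problems = []
    if min_instances < 1:
        problems.append("MinInstances must be >= 1")
    if max_instances < 1:
        problems.append("MaxInstances must be >= 1")
    if max_instances < min_instances:
        problems.append("MaxInstances must be >= MinInstances")
    if per_slot < 1:
        problems.append("MaxInstancesPerSlot must be >= 1")
    if per_host is not None:
        if per_host < 1:
            problems.append("MaxInstancesPerHost must be >= 1")
        else:
            # Recommendation: MaxInstancesPerSlot divides evenly into MaxInstancesPerHost.
            if per_slot >= 1 and per_host % per_slot != 0:
                problems.append("MaxInstancesPerSlot does not divide evenly into MaxInstancesPerHost")
            # Recommendation: MaxInstancesPerHost divides evenly into MaxInstances.
            if max_instances >= 1 and max_instances % per_host != 0:
                problems.append("MaxInstancesPerHost does not divide evenly into MaxInstances")
    return problems
```

For example, check_instance_limits(1, 4, per_slot=2, per_host=4) reports no problems, while check_instance_limits(3, 2) reports that MaxInstances is below MinInstances.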
Description: Specifies whether a service will use GPUs (instead of, or in addition to, CPU slots). The value can be true or false. If true, the service uses the GPUs configured for your cluster (the cluster must include hosts that have GPUs). The default is false.
If you set the UseGPU parameter to true, you can also set ExclusiveGPU to specify whether the requested GPU mode is exclusive.
The InstanceToSlotRatio parameter specifies the number of GPUs to use for a service, in the format of 1:N, where N is the number of GPUs for each service.
Required or optional: Optional
Default value: false
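As a small illustration of the 1:N format used by InstanceToSlotRatio, the following Python sketch extracts N, the number of GPUs for each service instance. The function name is hypothetical, and the strictness of the check (the left side must be exactly 1 and N must be at least 1) is an assumption made for this example rather than documented behavior.

```python
def gpus_per_instance(ratio):
    """Return N from an InstanceToSlotRatio value written as 1:N.

    Hypothetical helper; the validation rules are assumptions.
    """
    instances, separator, gpus = ratio.partition(":")
    if separator != ":" or instances.strip() != "1" or not gpus.strip().isdigit():
        raise ValueError("expected a ratio in the form 1:N, got %r" % ratio)
    count = int(gpus)
    if count < 1:
        raise ValueError("N must be at least 1")
    return count
```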
Description: Set if you configured the UseGPU parameter to true, and you want to specify that the requested GPU mode is exclusive (versus default mode, which is sharable).
Required or optional: Optional
Default value: false
Description: Specifies the order in which the service controller starts services at start time. Specify a number greater than or equal to 1 that represents the relative starting order for this service. A value of 1 is the lowest priority.
Required or optional: Optional
Default value: 1
Description: Describes the service lifecycle parameters of a service. It defines parameters for starting and restarting the service, and detecting hung services.
Specifies exit values, which, if received from all instances of a service on a host, should cause that host to be blocked from running this service. The EGO service controller should not restart instances on that host unless the host is unblocked.
Required or optional: Optional
Default value: No exit values are checked
Description: Specifies whether the service starts automatically or manually. Specify AUTOMATIC or MANUAL.
Required or optional: Optional
Default value: AUTOMATIC
Description: Specifies the maximum number of times this service can be restarted before it is flagged in an error state.
The restart counter is reset once the service stays in the RUN state for longer than five minutes.
This attribute can be set in the service profile, and used with the ESC_SERVICE_START_FAIL_RETRY parameter in the egosc_conf.xml file.
Required or optional: Optional
Valid values: Specify a number greater than or equal to 0
Default value: Infinite
Description: Specifies the length of time to wait after a host fails until the service should be restarted on another host.
Specify the duration in the format PTnHnMnS, where:
- nH is the number of hours
- nM is the number of minutes
- nS is the number of seconds
The number of seconds can include decimal digits to arbitrary precision.
Description: Specifies whether there is a dependency on another service. The type attribute of the element describes the dependency type, and the content of the element holds the name of the service that this service depends on.
Required or optional: Optional
Default value: No dependency
Description: Specifies whether the service dependency is checked only when the service starts or is maintained throughout the lifecycle of the service.
Required or optional: Optional
Valid values: OnStart
Default value: OnStart
This attribute can be set in the service profile, and used with the ESC_SERVICE_START_FAIL_RETRY parameter in the egosc_conf.xml file.
Required or optional: Optional
Valid values: Integer
- Virtual IP conflicts can be resolved by enabling fencing in the service profile. Use the fencing command to clean up the virtual IP on the old compute node before new service instances are started on the new compute node.
- To use service instance fencing, you must also configure HostFailoverInterval. Otherwise, fencing is not triggered when you restart the service instance manually.
Required or optional: Optional
Default value: Fencing is not enabled
Description: Specifies the duration that the service controller waits for fencing to complete before terminating the activity for the service instance.
Specify the duration in the format PTnHnMnS, where:
- nH identifies the number of hours
- nM identifies the number of minutes
- nS identifies the number of seconds (including decimal digits)
Required or optional: Optional
Default value: Infinite
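To make the PTnHnMnS format concrete, the following Python sketch converts a duration string to seconds. It is a minimal illustration, not EGO's own parser: the function name is hypothetical, and it assumes that each of the nH, nM, and nS parts can be omitted as long as at least one is present (as in XML Schema durations), which this reference does not state explicitly.

```python
import re
from decimal import Decimal

# PT, then optional hours, minutes, and seconds; seconds may carry decimal digits.
_DURATION = re.compile(
    r"PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?"
)


def duration_to_seconds(text):
    """Convert a PTnHnMnS duration to a number of seconds (as a Decimal)."""
    match = _DURATION.fullmatch(text)
    if match is None or text == "PT":
        raise ValueError("not a PTnHnMnS duration: %r" % text)
    hours = int(match.group("hours") or 0)
    minutes = int(match.group("minutes") or 0)
    seconds = Decimal(match.group("seconds") or "0")
    return Decimal(hours * 3600 + minutes * 60) + seconds
```

For example, duration_to_seconds("PT1H30M") gives 5400 and duration_to_seconds("PT0.5S") gives 0.5. Decimal is used so that fractional seconds keep the precision written in the profile.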
Description: Describes the resource allocation information for this service.
Description: Specifies the fully-qualified name of a consumer in the consumer tree. For example, /DeptA/ProjB/ConsumerC indicates that ConsumerC is part of ProjB, which is, in turn, part of DeptA.
Required or optional: Required
Description: Describes the resources required by this service.
Be aware that if you specify a resource plan created for multidimensional scheduling, the default number of slots scheduled for sharing (defined by slotmapping in MDPlan.xml) is used for the service.
Required or optional: Optional
Default value: "", which means that the service can use resources from all the resource groups associated with the consumer.
- select(expression)
- select(expression operator expression)
- select((expression operator expression) operator expression)
Required or optional: Optional
Description: Describes multidimensional scheduling parameters for the service.
Description: Specifies a multidimensional resource plan.
Required or optional: Required
Default value: No multidimensional resource plan is defined.
Description: Defines the resource metrics that can be scheduled for this service. You can use any of four predefined resource metrics or specify your own metric.
Required or optional: Required
- MetricName: Specify the name of the resource metric (predefined or user-defined). The following predefined metrics are available:
  - ncpus: Number of CPU cores that are used for multidimensional scheduling.
  - maxmem: Maximum amount of RAM (in MB) that is used for multidimensional scheduling.
  - maxswp: Maximum amount of virtual memory (swap space) (in MB) that is used for multidimensional scheduling.
  - maxtmp: Maximum amount of space in /tmp (Linux®) or the OS default temp directory (Windows).
- MetricValue: Specify the value for the resource metric.
<sc:ResourceMetric>
  <sc:MetricName>ncpus</sc:MetricName>
  <sc:MetricValue>1.00</sc:MetricValue>
</sc:ResourceMetric>
<sc:ResourceMetric>
  <sc:MetricName>maxmem</sc:MetricName>
  <sc:MetricValue>128.00</sc:MetricValue>
</sc:ResourceMetric>
Description: Describes the resources required by this service.
Indicates the type of resource. The URI is intended to match the URI of the namespace of an XML schema that is used to further describe the resource.
Required or optional: Required
Description: Ties a particular ActivitySpecification to resources that match given attributes. For example, on a compute host type of resource, you might use the hostType attribute to restrict an ActivitySpecification to hosts with a hostType of LINUX86.
Description: Specifies a name-value pair of any type. The name provides an identifier for this attribute, and type indicates the XML schema type. The value of the Attribute is in the element content, which is optional. Thus, Attribute can be used either to provide the value of a given attribute or to express the names and types of attributes that are supported by various objects within the resource orchestration layer. In the service definition context, only hostType is a supported name.
- "all": applies to any hostType
- "NTX86": applies to 32-bit Windows
- "NTX64": applies to 64-bit Windows
- "LINUX86": applies to 32-bit x86 Linux
- "X86_64": applies to 64-bit x86 Linux
Required or optional: Optional
Default value: "all"
Description: Specifies an identifier for the attribute.
Required or optional: Optional
Description: Specifies an NCName that indicates the XML Schema type that represents this attribute. The type attribute must contain one of the names of the built-in types from the XML schema http://www.w3.org/TR/xmlschema-2/#built-in-datatypes specification.
Required or optional: Required
Description: Describes the execution parameters for an activity. The only required sub-element is the command that needs to be run to start the activity.
Required or optional: Optional
Description: A name used to refer to an Activity. Not guaranteed to be unique within a cluster.
Required or optional: Optional
Default value: ""
Description: Specifies the command to be run as part of an activity. Specify the full command line of the program to be run, including arguments.
Example: /usr/local/bin/blastall -p blastn -d nt -i sequences
Required or optional: Required
Description: Specifies the operating system user ID to use when running an activity.
Required or optional: Optional
Default value: The execution user defined in the consumer specified by AllocationSpecification.
Description: Specifies the file creation mask used within the context of a running activity, which determines the default permissions given to files created by the activity. This is an absolute mode.
Specify a 4-digit octal number (each digit from 0 to 7) as described in the POSIX umask(1) man page.
Required or optional: Optional
Default value: 0077
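The effect of a file creation mask can be worked out by clearing the mask bits from the mode that a program requests. The following Python sketch shows the arithmetic; the function name is hypothetical, and it assumes the common case in which programs request mode 0666 for new files and 0777 for new directories.

```python
import re


def permissions_after_umask(umask_text):
    """Return (file_mode, directory_mode) as 4-digit octal strings.

    Assumes the creating program requests 0666 for files and 0777 for
    directories, which is typical but not guaranteed.
    """
    if re.fullmatch(r"[0-7]{4}", umask_text) is None:
        raise ValueError("umask must be a 4-digit octal number: %r" % umask_text)
    mask = int(umask_text, 8)
    file_mode = 0o666 & ~mask       # bits set in the mask are removed
    directory_mode = 0o777 & ~mask
    return format(file_mode, "04o"), format(directory_mode, "04o")
```

With the default of 0077, permissions_after_umask("0077") returns ("0600", "0700"): files and directories created by the activity are accessible only to the owning user.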
- For Windows, the workingdirectory parameter specifies the full path to the directory in which an activity executes, and not the path to the command itself. If this parameter is empty, the process will have the same current drive and directory as the calling process.
- For Linux, the workingdirectory parameter specifies the base path for the command if the command attribute is constructed in a relative manner. For example, if the workingdirectory parameter is /example/scripts and the command attribute is ./script.sh, then the full command that is called is /example/scripts/script.sh.
Required or optional: Optional
Default value: ""
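The Linux behavior described above amounts to joining the working directory and a relative command path. The following Python sketch reproduces the example from this section; the function name is hypothetical, and treating every non-absolute program path as relative to the working directory is an assumption made for illustration.

```python
import posixpath


def resolve_command(working_directory, command):
    """Return the command line with a relative program path made absolute.

    Hypothetical helper: assumes the program is the first space-separated
    token and that any non-absolute program path is resolved against the
    working directory.
    """
    program, _, arguments = command.partition(" ")
    if working_directory and not posixpath.isabs(program):
        program = posixpath.normpath(posixpath.join(working_directory, program))
    return program + (" " + arguments if arguments else "")
```

resolve_command("/example/scripts", "./script.sh") returns "/example/scripts/script.sh", while a command that already uses an absolute path is returned unchanged.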
Description: Models an operating system environment variable to indicate which environment variables should be set within the running context of an activity.
Required or optional: Optional
Description: Specifies the name of the environment variable. Specify the name of the environment variable (for example, PATH or LD_LIBRARY_PATH), and specify any string for the content.
Description: Specifies an operating system limit within the context of a running activity.
The type attribute is optional, and can be used to indicate whether a soft or hard limit is being specified. If a value is not specified, the rlimit of the root of the target host is used.
- CPU
- FSIZE
- DATA
- STACK
- CORE
- RSS
- NOFILE
- MEMLOCK
- VMEM
Example: /usr/local/jboss/bin/shutdown all
If JobController is defined, the command is started as a means to stop the activity. The resource orchestrator forcefully kills the activity if the command fails to stop the activity within ControlWaitPeriod.
If you do not have a JobController script for cleanup and shutdown activities, specify gracefulshutdown to enable shutdown of service instances. When gracefulshutdown is specified, the container to be terminated is marked for graceful shutdown when it is started. After the grace period (specified in ControlWaitPeriod) has passed, if the instance container is still alive, SIGKILL is sent to terminate the container.
Required or optional: Optional
Default value: No JobController. The container will be killed by a signal.
- nH defines the number of hours.
- nM defines the number of minutes.
- nS defines the number of seconds. The number of seconds can include decimal digits.
Required or optional: Optional
Valid values: Greater than 0 seconds and less than 1 hour
Default value: 2 minutes
Description: Specifies common parameters for all Docker containers of a service and parameters for individual Docker containers.
Required or optional: Optional
Description: Specifies common parameters for all Docker pods or containers.
Required or optional: Required
Description: Specifies Docker network parameters.
Required or optional: Required
Description: Specifies the network type.
- bridge uses normal Docker networking.
- host uses the host's networking stack.
- external uses an external network provisioning executable. Note: If you select external network, add EGO_DOCKER_NETWORK_PLUGIN to the ego.conf file to specify the network provisioning executable.
- sdn uses multi-host networking, allowing containers to communicate with each other across Docker pods if connected to the same network. You can enable SDN on all Docker-supported operating systems for Linux (see Supported Docker versions). SDN is not supported for Linux on POWER®.
Valid values: bridge, host, external, or sdn
Required or optional: Required
Description: The specified name of the SDN network.
Required or optional: Optional
Description: The specified name for the network container in an SDN pod. As a best practice, the pod name should only be specified if you have also specified the network type as sdn. If specified, this name must be unique across the SDN network it is connected to.
Required or optional: Optional
Description: Specifies the DNS server IP address.
Example: 123.123.123.123.
Required or optional: Optional
Description: Specifies the DNS search domain.
Example: abc.com.
Required or optional: Optional
Description: Specifies an external script variable name and its value as content.
Example: myname1/myvalue1
Required or optional: Optional
Description: Specifies, in seconds, the maximum allowed duration for the overall stop operation for the containers in a pod.
Example: 30
Required or optional: Optional
Default value: 20 seconds multiplied by the number of containers in a pod plus 60 seconds.
Description: Specifies, in seconds, the maximum allowed duration of the overall start operation for the containers in a pod. PEM terminates the Docker Controller if it has not reported Docker container info to PEM during the time period.
Example: 40
Required or optional: Optional
Default value: 20 seconds multiplied by the number of containers in a pod plus 60 seconds.
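The default for the stop and start timeouts depends on the size of the pod. The following Python sketch computes it under one reading of the sentence above, namely (20 seconds × number of containers) + 60 seconds. That order of operations is an assumption, and the function name is hypothetical.

```python
def default_pod_timeout(container_count):
    """Return the default timeout in seconds for a pod.

    Assumes the default means (20 * containers) + 60; this reference does
    not spell out the order of operations.
    """
    if container_count < 1:
        raise ValueError("a pod needs at least one container")
    return 20 * container_count + 60
```

Under this reading, a pod with three containers has a default timeout of 120 seconds.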
Description: Specifies an environment variable name and value for the container.
Example: name: MYVARIABLENAME content: myvariablevalue
Required or optional: Optional
Description: Specifies the volume to create for the Docker container.
Required or optional: Optional
Description: Specifies the volume environment name for the Docker container.
Example: ABC
Required or optional: Optional
Description: Specifies the volume path on the host.
Example: /myHostpath/path
Required or optional: Optional
Description: Specifies the disk I/O operation type as either read-write (rw) or read-only (ro).
Example: rw
Required or optional: Optional
Description: Specifies Docker container parameters.
Required or optional: Optional
Description: A unique (to a pod definition) identifier that is used to distinguish a container within a pod.
Example: dockerContainer1
Required or optional: Required
Description: Specifies the Docker image name.
Example: dockerImage1
Required or optional: Required
Description: Specifies the path to overwrite the default ENTRYPOINT of the Docker image.
Example: /bin/bash
Required or optional: Optional
Description: Specifies the command to be run inside the container. Specify the full command line of the program to be run, including arguments.
Example: Content: dockerCommand1
Required or optional: Optional
Description: Specifies the port number to be published to the host. Optionally, specify the host port number, protocol, and the IP address to which to bind the port.
Example: PublishedPort: 177
- hostPort: 7772
- Protocol: tcp or udp. Default is tcp.
- ip: 192.111.222.233
Required or optional: Optional
Valid values: 1 to 65535
Description: Specifies the CPU shares (relative weight).
Example: 1024
Required or optional: Optional
Valid values: 0 to 1024
Description: Specifies the CPUs to allow execution.
Example: 0-10 or 1, 2,
Required or optional: Optional
Valid values: Integers greater than or equal to 0
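A CPU list such as 0-10 or 1, 2 can be expanded into individual CPU numbers to check it before use. The following Python sketch is illustrative: the function name is hypothetical, and the accepted syntax (comma-separated integers and start-end ranges, with empty entries ignored) is an assumption based on the examples above rather than a full definition of the format.

```python
def expand_cpu_list(text):
    """Expand a CPU list such as "0-2,4" into a sorted list of CPU numbers.

    Assumed syntax: comma-separated non-negative integers or start-end
    ranges; empty entries (for example, after a trailing comma) are skipped.
    """
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        first, dash, last = part.partition("-")
        start = int(first)                   # raises ValueError on bad input
        end = int(last) if dash else start
        if start < 0 or end < start:
            raise ValueError("invalid CPU range: %r" % part)
        cpus.update(range(start, end + 1))
    return sorted(cpus)
```

For example, expand_cpu_list("0-2,4") returns [0, 1, 2, 4].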
Description: Specifies the maximum memory to use, in MB.
Example: 2048
Required or optional: Optional
Description: Specifies the path to the registry to pull the Docker image.
Example: name: myRepository1/path1
Required or optional: Optional
Description: Specifies an environment variable name and value for the container.
Example: name: MYVARIABLENAME content: myvariablevalue
Required or optional: Optional
Description: Specifies the volume to create for the Docker container.
Required or optional: Optional
Description: Specifies the container's user ID to use when running the Docker container, and this user must exist in the container.
Example: root
Description: The value can be true or false. If true, the Docker container is not forcibly stopped until the time (in seconds) that is set by the stop timeout parameter for ego:Docker has elapsed. The default is false, which means that the container is forcibly stopped.
Required or optional: Optional
Valid values: true or false
Description: When a job monitor is not specified, the normal termination of a short-running container will not cause the Docker pod to be shut down if any other working container is still running.
Required or optional: Optional
Valid values: true or false
Description: Specifies whether extended privileges are given to this Docker container.
Valid values: true or false
Default value: false
Required or optional: Optional
Description: Indicates a dependency on another service or Docker container. If Dependency is defined in ego:DockerContainer, the type can only be OnStart. If the type is OnStart, the service is started only if the service that is being depended on is in the STARTED state; or the Docker container is started only if the Docker container that is being depended on is started. Other types are not supported.
Example: Type: OnStart Content: dockerContainer2
Required or optional: Optional
Description: Specifies user-defined conditions for when a container is considered to be ready. If no ready conditions are specified, the container is considered to be ready immediately after starting.
Required or optional: Optional
Description: Specifies a port number that needs to be listening in the container in order for the container to be considered ready.
Required or optional: Optional
Description: Specifies a process name that must be running in the container in order for the container to be considered ready.
Required or optional: Optional
Description: Specifies a user-defined script that returns with exit code 0 if a container is ready, or with exit code different from 0 if the container is not ready.
Required or optional: Optional
Description: Specifies a string that is passed as-is to the Docker Controller, to potentially be used to apply additional Docker parameters (requires customization of Docker Controller).
Example: --option1=value1 --option2=value2
Required or optional: Optional
Description: Impersonates the execution user as a host level user when specified.
Valid values: merge or overwrite
Required or optional: Optional
Description: Additional user names that need to be impersonated as a host user.
Required or optional: Optional
Required or optional: Optional
Valid values: True or False
Default value: False
Description: Designates a user account for impersonation, enabling the service to run as the execution user specified for the consumer but using the credentials of the Impersonate user. The user's credentials are passed to the service from the EGO_IMPERSONATE_CREDENTIAL environment variable.
If you are a cluster administrator or have the Service Assign Impersonation (Any User) permission, you can assign any user account, including your own, for impersonation. If you are a consumer administrator or have the Services Assign Impersonation (Self) permission, you can only specify your own user account for impersonation.
Required or optional: Optional
Default value: Not defined
Description: Specifies the type of service.
Required or optional: Required
Valid values: Default
Description: Specifies the service name for the service profile.
Required or optional: Required
Valid values: Specify a mix of 1 to 40 alphanumeric characters and hyphens. The service name must start with a letter (that is, it cannot start with a number or a hyphen).
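The naming rule above can be expressed as a regular expression. The following Python sketch is a minimal check, assuming that "alphanumeric" means ASCII letters and digits; the function name is hypothetical.

```python
import re

# A letter first, then up to 39 more letters, digits, or hyphens (40 in total).
_SERVICE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,39}")


def is_valid_service_name(name):
    """Return True if the name satisfies the service name rule in this reference."""
    return _SERVICE_NAME.fullmatch(name) is not None
```

For example, is_valid_service_name("my-service1") is True, while "1service" and "-service" are rejected because they do not start with a letter.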