awsprov_templates.json
The awsprov_templates.json file defines the mapping between LSF resource demand requests and AWS instances.
A template represents a set of hosts that share some attributes, such as the number of CPUs, the amount of available memory, the installed software stack, and the operating system.
The default location for the file is <LSF_TOP>/conf/resource_connector/aws/conf/awsprov_templates.json.
Description
LSF requests resources from the resource connector by specifying the number of instances of a particular template that it requires to satisfy its demand. The resource connector uses the definitions in this file to map this demand into a set of allocation requests in AWS.
Parameters
- templateId
- The unique template name. The templateId cannot contain underscores (_).
- maxNumber
- The maximum number of instances to provide. Set maxNumber to an appropriate value according to the instance quota of the LSF project.
As of Fix Pack 14, to support AWS EC2 Fleet templates, maxNumber is a multiplier of the ncpus value, not a direct number of instances. For EC2 Fleet, maxNumber multiplied by ncpus is the maximum number of slots that can be provisioned for the template. For example, if maxNumber is 5 and ncpus is 2, the maximum number of slots for the fleet request is 10.
- attributes
- A list of attributes that represent the hosts in the template from the LSF point of view. LSF attempts to place its pending workload on hosts that match these attributes to calculate how many instances of each template to request.
You can define any arbitrary string resource in the lsf.shared file and use that as an attribute in the awsprov_templates.json file. You can then use that attribute in a bsub select string (for example, bsub -R "select[zone == us_east_2a]"). If zone == us_east_2a is selected at job submission, hosts are created from the template that sets the zone attribute to us_east_2a.
To submit a job with a specific template name or a template string attribute, you must define that string resource in the lsf.shared file and in the user_data.sh script so that the resource is added to the lsf.conf file on the server host that is created from the template.
The user_data.sh script is located in the <LSF_TOP>/<LSF_VERSION>/resource_connector/aws/scripts directory.
Each attribute string in the list has the following format:
"attribute_name": ["attribute_type", "attribute_value"]
- attribute_name
- An LSF resource name, for example, type or ncores.
The attribute name must either be a built-in resource (such as r15s or type), or be defined in the Resource section in the lsf.shared file on the LSF management host.
- attribute_type
- Can be Boolean, String, or Numeric, and must match the corresponding resource definition in the lsf.shared file.
- attribute_value
- The value of the resource that is provided by hosts. For Boolean resources, use 1 to define the presence of the resource and 0 to define its absence. For Numeric resources, specify a range that uses [min:max].
Depending on your cloud provider, various attributes are supported in the template.
The following attributes have default values if they are not defined:
- type
- The default value is given by the LSB_RC_DEFAULT_HOST_TYPE parameter in the lsf.conf file. The default value of LSB_RC_DEFAULT_HOST_TYPE is X86_64.
- ncpus
- Default value is 1.
Take note of these attributes:
- gpuextend
- Optional. A string that represents the GPU topology on the template host.
This attribute value is in the following format:
"key1=value1;key2=value2;..."
The following keys are supported in this attribute:
- ngpus
- Total number of GPUs. This must be defined either as a key in gpuextend or defined as a separate attribute. If it is defined in both places, the key value in gpuextend takes precedence.
- nnumas
- Total number of NUMA nodes. The default value is 1.
- gbrand
- The GPU brand. This value is case sensitive, and supports NVIDIA GPUs. For a list of GPU brands and models, run the nvidia-smi -L command.
For example, for Tesla K80, the GPU brand is Tesla.
- gmodel
- The GPU model. This value is case sensitive, and supports NVIDIA GPUs. For a list of GPU brands and models, run the nvidia-smi -L command.
For example, for Tesla K80, the GPU model is K80.
- gmem
- The total GPU memory, in MB.
- nvlink
- Specifies whether the GPU supports NVLink. Valid values (case insensitive) are y, n, yes, no.
- imageId
- The ID of the Amazon Machine Image (AMI) that has LSF preinstalled on it. This AMI is used to launch virtual instances.
- subnetId
- The subnet name (virtual private cloud) used to launch virtual instances. Use the subnet through which the instance can communicate with the LSF cluster. For AWS spot instances only, you can specify more than one subnet, separated by commas. For example:
"subnetId": "subnet-bc219af5, subnet-ac819ch2"
Multiple subnets are not supported for on-demand AWS instances.
- vmType
- The machine type of the AWS instance you want to create. The vmType that is configured in each template must correctly represent the template attributes presented to LSF from AWS.
For AWS spot instances only, you can specify multiple machine types, separated by commas. For example:
"vmType": "c4.large, m4.large"
Multiple machine types are not supported for on-demand AWS instances.
- launchTemplateId
- Optional. The ID of the launch template. Specify a string between 1 and 255 characters in length.
- launchTemplateVersion
- Optional. The version number of the launch template to select when
launching instances. Specify the version number of the launch template or one of the following keywords:
- $Latest
- Amazon EC2 Auto Scaling selects the latest version of the launch template when launching instances.
- $Default
- Amazon EC2 Auto Scaling selects the default version of the launch template when launching instances. This is the default value of the launchTemplateVersion attribute.
- fleetRole
- For Spot Instance templates. Specifies the role that grants the permission to bid on, launch, and terminate spot fleet instances on behalf of the user.
- spotPrice
- For Spot Instance templates. Specifies the bid price for the instance. The spot instance is launched when the spot price of the instance is below the bid specified in the spotPrice attribute.
The spotPrice attribute is used in determining if the request is a spot request or an on-demand request. If you set the spotPrice attribute with a positive number, the AWS plug-in considers this request as a spot request. If the attributes fleetRole or allocationStrategy are defined, but the spotPrice is not defined, the request is considered an on-demand request.
If an on-demand request is initiated using a template with multiple vmType or subnetId values, the request fails.
- allocationStrategy
- Optional. For spot instance templates. The allocation strategy for your spot fleet determines how it fulfills your spot fleet request from the possible spot instance pools that are represented by its launch specifications. You can specify the following allocation strategies in your spot fleet request:
- CapacityOptimized
- The Spot instances come from the pools with optimal capacity for the number of instances that are launching. This is the default strategy.
- LowestPrice
- The Spot instances come from the pool with the lowest price.
- Diversified
- The Spot instances are distributed across all pools.
- keyName
- Optional. The name of the key-pair file that is used by ssh to log in to the launched instance. If no value is specified, the instance is launched with no key.
If the proper permission is not available, the value is ignored and a message is written to the AWS log.
- interfaceType
- Optional. The type of network interface to attach to the
instance.
Specify efa to attach an Elastic Fabric Adapter (EFA) interface to an instance. You can only specify an EFA network interface for supported AMI or instance types. For more details on supported AMI or instance types for EFA interfaces, refer to the Amazon Web Services website (https://aws.amazon.com/).
Note: If you defined efa in the AWS launch template, you cannot remove or unset the efa interface value by using this AWS launch template.
The default value is interface, which specifies that a non-EFA network interface is attached to the template.
- securityGroupIds
- Optional. A list of strings for AWS security groups that are applied to instances. If you don't specify securityGroupIds, AWS uses the default group.
- instanceProfile
- Specifies an AWS IAM instance profile to assign to the requested instance. Jobs running in that instance can use the instance profile credentials to access other AWS resources. The instance profile can be specified by one of the following methods:
- Short name; for example, MyProfile.
Valid characters for the instance profile name are uppercase and lowercase alphanumeric characters and any of the following ASCII characters: equal sign (=), comma (,), period (.), at sign (@), minus sign (-).
- AWS Amazon Resource Name (ARN); for example, arn:aws:iam::<account number>:instance-profile/LSFRole.
The colon character (:) cannot appear in the short name or path. The string arn: at the beginning of the profile reference determines whether the reference is an ARN or a short name.
Note: In this context “IAM Role” is essentially equivalent to “Instance Profile”.
- instanceTags
- Optional. A string that represents a list of keys and their values. These key-value pairs are used to tag the instance by using the Amazon instance tagging feature. If an instance is launched that uses TemplateA, it is tagged with the value of the instanceTags attribute defined in TemplateA.
If instanceTags is not specified, LSF still tags the newly launched instances with the following key-value pair:
InstanceID = <ID of the instance created>
The instanceTags attribute also tags EBS volumes with the same tag as the instance. EBS volumes are persistent block storage volumes used with an EC2 instance. EBS volumes are expensive, so you can use the instance ID that tags the volumes for accounting purposes.
Note: The tags cannot start with the string aws:. This prefix is reserved for internal AWS tags. AWS gives an error if an instance or EBS volume is tagged with a keyword starting with aws:. Resource connector removes and ignores user-defined tags that start with aws:.
- ebsOptimized
- An Amazon EBS-optimized instance provides additional, dedicated capacity for Amazon EBS input and output. This optimization improves performance for your EBS volumes by minimizing contention between Amazon EBS input and output and other traffic from your instance.
See the AWS documentation for more information about Amazon EBS-optimized instances.
Use the ebsOptimized attribute in your AWS template to create instances with Amazon EBS optimization enabled.
Valid values are Boolean true and false. The default is false. You must specify a vmType that supports EBS optimization.
The EBS optimization service is expensive and only available on high-end instance types. If the instance type does not support the attribute, an error message is issued and the resource connector suspends the provider for 10 minutes. You can change the vmType in the template and restart ebrokerd.
- priority
- By default, LSF sorts candidate template hosts by template name. To have LSF favor one template over another, set the priority attribute. LSF uses higher priority templates first (for example, assign higher priorities to less expensive templates).
- placementGroupName
- Optional. The name of the placement group that the instances are launched into. The group must exist on your AWS account. Successfully launching the instances into a placement group has the following requirements:
- A placement group can't span multiple availability zones.
- The name that you specify for a placement group must be unique within your AWS account.
- The instance type that is defined in the template must be supported by the placement group created.
- Terminate all the instances in the placement group before the placement group is deleted.
- tenancy
- Requires placementGroupName in the template. The values for tenancy can be default, dedicated, and host. However, LSF currently supports only default and dedicated.
- userData
- Optional. A string that represents a list of keys and their values. The
string has the following format:
<key1>=<value1>;<key2>=<value2>; ...
- key
- The key name of the userData, such as packages, volume, zone, or templateName.
- value
- A comma-separated list of userData values, for example, package1, package2.
Each key is converted to uppercase by the resource connector and exported as an environment variable with the specified value inside the instance (and is accessible by the user script). After userData is defined, it is divided into keys and values and exported to the instance's environment variables.
For example, if the userData parameter is defined as packages=M,N;logfile=X, the following environment variables are exported inside the instance at start time:
PACKAGES=M,N
LOGFILE=X
These variables can be read by the user_data.sh script in the instance as the keys PACKAGES and LOGFILE.
- ec2FleetConfig
- Required for AWS EC2 Fleet instances (offered as of Fix Pack 14). An absolute or relative path to the EC2 Fleet configuration file (for example, to an ec2-fleet-config.json file). A relative path must be relative to the <LSF_TOP>/conf/resource_connector/aws/conf directory.
- onDemandTargetCapacityRatio
- Optional for AWS EC2 Fleet instances (offered as of Fix Pack 14). Defines how on-demand and spot instances are distributed among the TotalTargetCapacity in each EC2 Fleet request.
Specify a floating-point number from 0.0 to 1.0. The value represents the ratio of OnDemandTargetCapacity to TotalTargetCapacity. To request pure on-demand or pure spot instances, set this ratio to 1 or 0, respectively. If the ratio is not defined, the distribution follows the DefaultTargetCapacityType in the ec2FleetConfig file.
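The gpuextend and userData values described in the parameter list are both semicolon-separated key=value strings. As a rough illustration of how such strings break down, the following Python sketch parses them, applies the stated rule that an ngpus key inside gpuextend takes precedence over a separate ngpus attribute, and uppercases userData keys as the userData description states. The function names and the sample attribute values (including the gmem figure) are hypothetical and are not part of LSF; the actual plug-in may treat whitespace and malformed input differently.

```python
def parse_pairs(text):
    """Split a "key1=value1;key2=value2" string into a dict (illustrative only)."""
    pairs = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def effective_ngpus(attributes):
    """Return ngpus, letting the gpuextend key win over a separate attribute."""
    gpu = parse_pairs(attributes.get("gpuextend", ["String", ""])[1])
    if "ngpus" in gpu:
        return int(gpu["ngpus"])
    if "ngpus" in attributes:
        return int(attributes["ngpus"][1])
    return None


def exported_environment(user_data):
    """Uppercase each userData key, following the userData description."""
    return {key.upper(): value for key, value in parse_pairs(user_data).items()}


# Hypothetical attribute values, not taken from a real template.
attrs = {
    "ngpus": ["Numeric", "2"],
    "gpuextend": ["String", "ngpus=4;nnumas=1;gbrand=Tesla;gmodel=K80;gmem=11441;nvlink=no"],
}
print(effective_ngpus(attrs))                          # 4
print(exported_environment("packages=M,N;logfile=X"))  # {'PACKAGES': 'M,N', 'LOGFILE': 'X'}
```

Splitting on the first equal sign only (partition) keeps comma-separated values such as M,N intact.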
Example overall awsprov_templates.json file
{
"templates":
[
{
"templateId": "TemplateA",
"attributes":
{
"type": ["String", "X86_64"],
"ncpus": ["Numeric", "4"],
"mem": ["Numeric", "480"],
"maxmem": ["Numeric", "512"],
"awshost": ["Boolean", "1"],
"zone": ["String", "us_east_2a"],
"pricing": ["String", "ondemand"],
"computeUnit": ["String", "encl_3"]
},
"imageId": "ami-27b1",
"subnetId": "subnet-b5738",
"vmType": "t2.micro",
"maxNumber": "1",
"keyName": "LSF_Key",
"securityGroupIds": ["sg-72314"],
"placementGroupName": "lsfgrp1",
"instanceTags": "group=LSF;project=Amazon",
"userData": "pricing=ondemand;zone=us_west_2b"
}
]
}
The example defines a template that is named TemplateA. LSF attempts to place any pending workload on hypothetical hosts of type X86_64 with ncpus=4 and mem>480 MB. If LSF successfully places some of its pending workload on N hosts, it requests N instances of TemplateA from the resource connector.
If demand is generated for this template, the connector logic attempts to allocate N hosts with the configured image and vmType (instance type) in AWS. If it obtains any instances, even fewer than requested, the resource connector informs LSF that it can use the instances.
In this example, the template also defines the awshost resource. You can make sure that your jobs generate demand for AWS resources by using 'select[awshost]' in your LSF job submission resource requirement strings.
The zone attribute is an example string resource that is defined in the lsf.shared file. If the zone attribute is specified, an instance is created in the specified zone.
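Some of the rules stated in the parameter descriptions can be checked mechanically before a template file is deployed. The following Python sketch covers three of them: the templateId must not contain underscores, each attribute is a [type, value] pair whose type is Boolean, String, or Numeric, and Boolean values are 0 or 1. The check_template function and the sample fragment are hypothetical; this is not a tool shipped with LSF and it does not cover every rule in this topic.

```python
import json

VALID_TYPES = {"Boolean", "String", "Numeric"}


def check_template(template):
    """Return a list of problems found in one template dict (illustrative checks only)."""
    problems = []
    template_id = template.get("templateId", "")
    if "_" in template_id:
        problems.append(f"templateId {template_id!r} contains an underscore")
    for name, spec in template.get("attributes", {}).items():
        if len(spec) != 2:
            problems.append(f"{name}: expected [type, value]")
            continue
        attr_type, value = spec
        if attr_type not in VALID_TYPES:
            problems.append(f"{name}: unknown type {attr_type!r}")
        elif attr_type == "Boolean" and value not in ("0", "1"):
            problems.append(f"{name}: Boolean value must be 0 or 1")
    return problems


# Hypothetical fragment with two deliberate mistakes.
doc = json.loads("""
{"templates": [{"templateId": "Template_A",
                "attributes": {"type": ["String", "X86_64"],
                               "awshost": ["Boolean", "yes"]}}]}
""")
for template in doc["templates"]:
    for problem in check_template(template):
        print(problem)
```

Because json.loads rejects malformed input, the same script also catches syntax slips such as a missing comma between attributes.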
The user script scripts/user_data.sh is included in the instance and run during instance startup.
#!/bin/bash
LSF_TOP=/usr/share/lsf
LSF_CONF_FILE=$LSF_TOP/conf/lsf.conf
# run user script to enable selecting template based on zone
%EXPORT_USER_DATA%
logfile=/tmp/userscript.log
env > $logfile
if [ -n "${zone}" ]; then
    sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resource ${zone}] [resourcemap ${zone}*zone]\"/" $LSF_CONF_FILE
    echo "update LSF_LOCAL_RESOURCES lsf.conf successfully, add [resource ${zone}] [resourcemap ${zone}*zone]" >> $logfile
else
    echo "zone doesn't exist in environment variable" >> $logfile
fi
Example awsprov_templates.json file for on-demand instances
{
"templates": [
{
"templateId": "templateA",
"attributes": {
"type": ["String", "X86_64"],
"ncores": ["Numeric", "1"],
"ncpus": ["Numeric", "1"],
"nram": ["Numeric", "512"],
"awshost1": ["Boolean", "1"],
"zone": ["String", "us_west_2a"],
"pricing": ["String", "ondemand"]
},
"imageId": "ami-8914cbe9",
"subnetId": "subnet-cc0248ba",
"vmType": "t2.nano",
"keyName": "martin",
"securityGroupIds": ["sg-b35182ca"],
"instanceTags": "Name=aws1-vm-1-from-cluster-aws1",
"userData": "zone=us_west_2a;pricing=ondemand"
}
]
}
Example awsprov_templates.json file for spot instances
{
"templates": [
{
"templateId": "templateB",
"attributes": {
"type": ["String", "X86_64"],
"ncores": ["Numeric", "1"],
"ncpus": ["Numeric", "1"],
"nram": ["Numeric", "512"],
"awshost1": ["Boolean", "1"],
"zone": ["String", "us_west_2b"],
"pricing": ["String", "spot"]
},
"imageId": "ami-8914cbe9",
"subnetId": "subnet-7c0dfb27,subnet-12286475,subnet-cc0248ba",
"keyName": "martin",
"vmType": "c4.xlarge, m4.large",
"fleetRole": "arn:aws:iam::700071821657:role/EC2-Spot-Fleet-role",
"securityGroupIds": ["sg-b35182ca"],
"spotPrice": "0.1",
"allocationStrategy":"diversified",
"instanceTags": "Name=aws1-vm-3-spot-aws1",
"userData": "zone=us_west_2b;pricing=spot"
}
]
}
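The spotPrice description states how a request is classified: a positive spotPrice makes it a spot request, otherwise it is an on-demand request even if fleetRole or allocationStrategy is defined, and an on-demand request with multiple vmType or subnetId values fails. The following Python sketch restates that decision. The function names are hypothetical, the sample values are abbreviated from the examples in this topic, and treating an unparsable price as undefined is an assumption rather than documented plug-in behavior.

```python
def classify_request(template):
    """Return "spot" or "on-demand" following the spotPrice rule."""
    try:
        price = float(template.get("spotPrice", 0))
    except (TypeError, ValueError):
        price = 0.0  # assumption: an unparsable price is treated as undefined
    return "spot" if price > 0 else "on-demand"


def on_demand_conflicts(template):
    """List the multi-valued fields that would make an on-demand request fail."""
    if classify_request(template) == "spot":
        return []
    return [field for field in ("vmType", "subnetId")
            if "," in template.get(field, "")]


spot_like = {
    "vmType": "c4.xlarge, m4.large",
    "subnetId": "subnet-7c0dfb27,subnet-12286475",
    "spotPrice": "0.1",
}
# Same template without spotPrice: allocationStrategy alone does not make it a spot request.
no_price = {key: value for key, value in spot_like.items() if key != "spotPrice"}
no_price["allocationStrategy"] = "diversified"

print(classify_request(spot_like), on_demand_conflicts(spot_like))  # spot []
print(classify_request(no_price), on_demand_conflicts(no_price))    # on-demand ['vmType', 'subnetId']
```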
Example awsprov_templates.json file for EBS-optimized instances
{
"templates": [
{
"templateId": "Template-VM-1",
"maxNumber": 4,
"attributes": {
"type": ["String", "X86_64"],
"ncores": ["Numeric", "1"],
"ncpus": ["Numeric", "1"],
"mem": ["Numeric", "1024"],
"awshost1": ["Boolean", "1"]
},
"imageId": "ami-40a8cb20",
"vmType": "m4.large",
"subnetId": "subnet-cc0248ba",
"keyName": "martin",
"securityGroupIds": ["sg-b35182ca"],
"instanceTags" : "group=project1",
"ebsOptimized" : true,
"userData": "zone=us_west_2a"
}
]
}
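The instanceTags value in this template ("group=project1") uses the same key=value format as the earlier examples, where several pairs are separated by semicolons. Because the resource connector is described as removing user-defined tags that start with aws:, it can help to preview which tags would remain. The following Python sketch does that; the user_tags function is hypothetical, and matching the prefix in lowercase only is an assumption.

```python
def user_tags(instance_tags):
    """Parse an instanceTags string and drop keys that use the reserved aws: prefix."""
    tags = {}
    for item in instance_tags.split(";"):
        key, sep, value = item.strip().partition("=")
        if not sep or not key:
            continue  # skip empty or malformed entries
        if key.startswith("aws:"):
            continue  # reserved prefix; assumed to be matched in lowercase only
        tags[key] = value
    return tags


print(user_tags("group=project1"))                  # {'group': 'project1'}
print(user_tags("group=LSF;aws:owner=me;project=Amazon"))  # {'group': 'LSF', 'project': 'Amazon'}
```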
Example awsprov_templates.json file for Amazon EC2 Fleet instances
As of Fix Pack 14, the LSF resource connector for Amazon Web Services (AWS) uses an Amazon EC2 Fleet API to create multiple (that is, a fleet of) instances. EC2 Fleet is an AWS feature that extends the existing spot fleet, which gives you a unique ability to create fleets of EC2 instances composed of a combination of EC2 on-demand, reserved, and spot instances, by using a single API.
{
"templates": [
{
"templateId": "fleet-lsf-template-1",
"maxNumber": 5,
"attributes": {
"type": ["String", "X86_64"],
"ncores": ["Numeric", "1"],
"ncpus": ["Numeric", "2"],
"mem": ["Numeric", "512"],
"awshost": ["Boolean", "1"]
},
"priority": "121",
"ec2FleetConfig": "ec2-fleet-config.json",
"onDemandTargetCapacityRatio":"0.5",
"instanceTags": "Name=fleet-lsf-template-1"
}
]
}
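For this fleet template, the arithmetic described under maxNumber and onDemandTargetCapacityRatio can be worked through in a few lines. The following Python sketch computes the maximum number of slots (maxNumber multiplied by ncpus) and splits a target capacity by the ratio. The function names are hypothetical, and the sketch does no rounding because this topic does not state how fractional capacities are handled.

```python
def max_fleet_slots(template):
    """maxNumber multiplied by ncpus; ncpus defaults to 1 when it is not defined."""
    ncpus = int(template.get("attributes", {}).get("ncpus", ["Numeric", "1"])[1])
    return int(template["maxNumber"]) * ncpus


def split_capacity(total_target, on_demand_ratio):
    """Return (on-demand, spot) capacity for a ratio from 0.0 to 1.0."""
    if not 0.0 <= on_demand_ratio <= 1.0:
        raise ValueError("onDemandTargetCapacityRatio must be from 0.0 to 1.0")
    on_demand = total_target * on_demand_ratio
    return on_demand, total_target - on_demand


# Values copied from the fleet-lsf-template-1 example.
fleet_template = {
    "maxNumber": 5,
    "attributes": {"ncpus": ["Numeric", "2"]},
    "onDemandTargetCapacityRatio": "0.5",
}
slots = max_fleet_slots(fleet_template)
ratio = float(fleet_template["onDemandTargetCapacityRatio"])
print(slots)                         # 10
print(split_capacity(slots, ratio))  # (5.0, 5.0)
```

A ratio of 1 yields only on-demand capacity and a ratio of 0 yields only spot capacity, which matches the description of pure on-demand and pure spot requests.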