policy_config.json

The policy_config.json file configures custom policies for resource providers for LSF resource connector. The resource policy plug-in reads this file.

The default location for the file is:
<LSF_TOP>/conf/resource_connector/policy_config.json

The policy_config.json file contains a JSON list of named policies and optimizations. Policies are rules that are applied during the calculation of demand. Optimizations are rules that are applied after the calculation of demand to try to get better results. Each policy contains a name, a consumer, the maximum number of instances that can be launched for the consumer, and the maximum number of instances that can be launched in a specified period.

Parameters

UserDefinedScriptPath
Optional. Specify the full path to your own resource provider policy script. Your custom policy script runs after the default plug-in, with the same input JSON file, and the demand that your script calculates is used; the demand that the default plug-in calculates is ignored. If UserDefinedScriptPath is defined and the script fails to run, the demand is 0, which means no demand.
The following example defines the path to the script userscript.py:
"UserDefinedScriptPath" : "/usr/share/lsf/10.1/scripts/userscript.py"
Policies
Optional. A list of policies that apply to the demand calculation. If no policies are defined, the demand that is calculated by resource connector is used.
Name
Required. The name of the policy. You can define multiple policies in the list. Each policy must have a unique name.
Consumer
Optional. The following consumer attributes are supported:
rcAccount
A list of accounts that can borrow hosts through LSF resource connector. Supported values are all or any valid account name that is defined in the RC_ACCOUNT tag in the lsb.queues file. If this attribute is not defined, the default value is all.
templateName
A list of template names. Supported values are all or any valid template name. If this attribute is not defined, the default value is all.
provider
A list of resource provider names. Supported values are all or any valid provider name. If this attribute is not defined, the default value is all.
perRcAccount
Used with the MaxNumber parameter in the Policies parameter of this policy_config.json file to define the maximum number of instances per resource connector account. Specify a list of accounts that can borrow hosts through LSF resource connector. The value cannot be set to all.

If the perRcAccount value is not defined, the default value will be default.

perTemplateName
Used with the MaxNumber parameter in the Policies parameter of this policy_config.json file to define the maximum number of instances per resource connector template. Specify a list of template names. Supported values are any valid template names. The value cannot be set to all.
perProvider
Used with the MaxNumber parameter in the Policies parameter of this policy_config.json file to define the maximum number of instances per resource provider. Specify a list of resource provider names. Supported values are any valid provider names. The value cannot be set to all.
If a consumer is not defined, the following attributes apply to all providers, templates, and accounts that are defined in the cluster:
MaxNumber
Optional. The maximum number of instances a user can create or launch for the consumer.

When specifying the MaxNumber parameter in the Policies parameter, you can also specify values for the perRcAccount, perTemplateName, and perProvider consumer attributes. Additionally, for the perRcAccount attribute, the value cannot be set to all; if the perRcAccount is not defined, the default value will be default.

StepValue
Optional. The StepValue parameter has two values, a step index and a step time, which are separated by a colon (:). The step index, also called the step value, is the maximum number of instances that can be launched at a time for the defined consumer. The step time, in minutes, controls how fast the cluster grows: it specifies how long the plug-in waits before it launches another set of instances of the size that is specified by the step index. If the consumer is not defined, the parameter applies cluster-wide.

For example, if the step value is defined as 5 and the step time is defined as 10 ("StepValue": "5:10") and a request comes in for 20 instances, 5 instances are launched in the first 10 minutes and 5 more in each following 10 minutes, until the demand is met or the maximum number of instances that is specified by the MaxNumber parameter has been launched.

The default for the step index is to launch all the instances at the same time.

The default value for the step time is 10 minutes. This default is applied only if a step value is defined but a step time is not defined.
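
The StepValue arithmetic described above can be illustrated with a short Python sketch. The function name launch_schedule is hypothetical and is not part of LSF; the sketch assumes that each batch is released at the start of its step-time window, which is one reading of the example and not a statement of how the plug-in is implemented.

```python
def launch_schedule(demand, step_value, max_number=None):
    """Return (start_minute, instances) batches for a StepValue string.

    Hypothetical helper for illustration only; not part of LSF.
    """
    index_text, _, time_text = step_value.partition(":")
    step_index = int(index_text)
    # Documented default: 10 minutes when only the step value is given.
    step_time = int(time_text) if time_text else 10
    if step_index <= 0 or step_time <= 0:
        raise ValueError("step index and step time must be positive")
    # MaxNumber, when given, caps the total number of instances.
    target = demand if max_number is None else min(demand, int(max_number))
    schedule = []
    launched = 0
    minute = 0
    while launched < target:
        batch = min(step_index, target - launched)
        schedule.append((minute, batch))
        launched += batch
        minute += step_time
    return schedule


if __name__ == "__main__":
    # A request for 20 instances with "StepValue": "5:10"
    print(launch_schedule(20, "5:10"))
    # [(0, 5), (10, 5), (20, 5), (30, 5)]
```

With max_number=12, the same request stops after three batches of 5, 5, and 2 instances.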

Optimizations
Optional. Rules that are applied to the provisioning results after the calculation of demand to try to get better results.
allocRules
Optional. A list of allocation rule entries that specify how many hosts from one template are worth replacing with hosts from another template. For every allocation rule entry, each of the following allocation rule attributes is mandatory:
fromTemplate
The following fromTemplate attributes are supported:
provider
Specifies a resource provider name. Supported value is any valid provider name.
templateName
Specifies a template name. The value must be a valid templateName under the provider.
factor
Number of hosts that are being replaced.
toTemplate
The following toTemplate attributes are supported:
provider
Specifies a resource provider name. Supported value is any valid provider name.
templateName
Specifies a template name. The value must be a valid templateName under the provider.
factor
Number of hosts that replace the fromTemplate hosts.

For any attribute that is not defined or any errors in a configuration, the allocation rule entry is ignored and the next entry is evaluated.
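
The rule that incomplete entries are skipped can be sketched in Python as follows. The names REQUIRED, is_complete, and usable_rules are hypothetical and are not part of LSF. The sketch checks only that the attributes are present and assumes that factor is a positive integer; it cannot check whether a provider or template name is valid in a particular cluster.

```python
REQUIRED = ("provider", "templateName", "factor")


def is_complete(side):
    """True if one side of a rule (fromTemplate or toTemplate) has every attribute."""
    return (
        isinstance(side, dict)
        and all(key in side for key in REQUIRED)
        # Assumption: factor is a positive integer (bool is excluded).
        and isinstance(side["factor"], int)
        and not isinstance(side["factor"], bool)
        and side["factor"] > 0
    )


def usable_rules(alloc_rules):
    """Keep only entries whose fromTemplate and toTemplate are both complete."""
    return [
        rule
        for rule in alloc_rules
        if is_complete(rule.get("fromTemplate"))
        and is_complete(rule.get("toTemplate"))
    ]
```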

Tip: Configuring the allocRules parameter complements configuring the RC_DEMAND_POLICY parameter in the lsb.queues file. The RC_DEMAND_POLICY parameter enables LSF to gather more pending jobs before provisioning. As a result, the jobs can be optimized with more information, but they start running later. For more information about the RC_DEMAND_POLICY parameter, see the RC_DEMAND_POLICY topic.
Consider the following optimizations configuration example:
"Optimizations" : {
    "allocRules" : [
        {
            "fromTemplate": {
                "provider" : "aws",
                "templateName" : "aws_template1",
                "factor" : 4
            },
            "toTemplate" : {
                "provider" : "aws",
                "templateName": "aws_template3",
                "factor" : 1
            }
        },
        {
            "fromTemplate": {
                "provider" : "aws",
                "templateName" : "aws_template1",
                "factor" : 2
            },
            "toTemplate" : {
                "provider" : "aws",
                "templateName": "aws_template2",
                "factor" : 1
            }
        }
    ]
  }
The following command displays the policies and optimizations configured: badmin rc view -c policies
Here is the display output:
Optimizations
        4 hosts (aws:aws_template1) replaced by 1 hosts (aws:aws_template3)
        2 hosts (aws:aws_template1) replaced by 1 hosts (aws:aws_template2)
For this example, to ensure that the rules work in most cases, set the template priorities appropriately. In this example configuration, aws_template1 is the template that is replaced, so it needs a higher template priority than the other templates. The rule for aws_template3 is the first rule to consider for the replacement, so the priority of aws_template3 needs to be lower than that of aws_template1 but higher than that of aws_template2. Therefore, the template priorities can have the following settings:
Template name    Priority
aws_template1    10
aws_template3    9
aws_template2    8
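
The replacement arithmetic in the two example rules can be illustrated with a short Python sketch. The function apply_rules and the greedy, in-order strategy are assumptions made for illustration; this text does not describe the algorithm that LSF actually uses, and template priorities are not modeled.

```python
RULES = [
    {"fromTemplate": {"provider": "aws", "templateName": "aws_template1", "factor": 4},
     "toTemplate": {"provider": "aws", "templateName": "aws_template3", "factor": 1}},
    {"fromTemplate": {"provider": "aws", "templateName": "aws_template1", "factor": 2},
     "toTemplate": {"provider": "aws", "templateName": "aws_template2", "factor": 1}},
]


def apply_rules(demand, rules):
    """Greedily apply rules, in order, to a {(provider, template): hosts} demand.

    Hypothetical helper for illustration only; not part of LSF.
    """
    result = dict(demand)
    for rule in rules:
        src = (rule["fromTemplate"]["provider"], rule["fromTemplate"]["templateName"])
        dst = (rule["toTemplate"]["provider"], rule["toTemplate"]["templateName"])
        # Number of complete groups of source hosts that this rule can replace.
        groups = result.get(src, 0) // rule["fromTemplate"]["factor"]
        if groups == 0:
            continue
        result[src] -= groups * rule["fromTemplate"]["factor"]
        result[dst] = result.get(dst, 0) + groups * rule["toTemplate"]["factor"]
    # Drop templates that no longer need any hosts.
    return {key: hosts for key, hosts in result.items() if hosts > 0}


if __name__ == "__main__":
    # 7 aws_template1 hosts: 4 become 1 aws_template3, 2 become 1 aws_template2, 1 remains.
    print(apply_rules({("aws", "aws_template1"): 7}, RULES))
```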

Example


{
  "UserDefinedScriptPath" : "/usr/share/lsf/10.1/scripts/userscript.py",
  "Policies":
  [
    {
      "Name": "Policy1",
      "Consumer":
      {
        "rcAccount": ["all"],
        "templateName": ["all"],
        "provider": ["all"]
      },
      "MaxNumber": "100",
      "StepValue": "5:10"
    },
    {
      "Name": "Policy2",
      "Consumer":
      {
        "rcAccount": ["default", "project1"],
        "templateName": ["aws_template1"],
        "provider": ["aws"]
      },
      "MaxNumber": "50",
      "StepValue": "5:20"
    },
    {
      "Name": "Policy3",
      "Consumer":
      {
        "perRcAccount": ["project1","project2"],
        "perTemplateName": ["ibm_template1","ibm_template2"],
        "perProvider": ["ibmcloudhpc"]
      },
      "MaxNumber": "100",
      "StepValue": "5:10"
    }
  ],
"Optimizations" : {
    "allocRules" : [
        {
            "fromTemplate": {
                "provider" : "aws",
                "templateName" : "aws_template1",
                "factor" : 4
            },
            "toTemplate" : {
                "provider" : "aws",
                "templateName": "aws_template3",
                "factor" : 1
            }
        },
        {
            "fromTemplate": {
                "provider" : "aws",
                "templateName" : "aws_template1",
                "factor" : 2
            },
            "toTemplate" : {
                "provider" : "aws",
                "templateName": "aws_template2",
                "factor" : 1
            }
        }
    ]
  }
}
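
Some of the constraints from the Parameters section (each policy needs a unique Name, and the perRcAccount, perTemplateName, and perProvider attributes cannot be set to all) can be checked before the file is deployed. The following Python sketch is one way to do that; validate_policies is a hypothetical helper, not an LSF tool, and it does not check whether account, template, or provider names exist in the cluster.

```python
import json


def validate_policies(config_text):
    """Return a list of problems found in the Policies list (empty if none).

    Hypothetical helper for illustration only; not part of LSF.
    """
    config = json.loads(config_text)
    problems = []
    seen = set()
    for position, policy in enumerate(config.get("Policies", []), start=1):
        name = policy.get("Name")
        if not name:
            problems.append(f"policy #{position}: Name is required")
        elif name in seen:
            problems.append(f"policy {name}: duplicate Name")
        else:
            seen.add(name)
        # Consumer is optional; the per* attributes must not contain "all".
        consumer = policy.get("Consumer", {})
        for attribute in ("perRcAccount", "perTemplateName", "perProvider"):
            if "all" in consumer.get(attribute, []):
                problems.append(f"policy {name}: {attribute} cannot be set to all")
    return problems
```

An empty list means that none of the checked constraints is violated; it does not guarantee that the plug-in accepts the file.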
To view the policies and optimizations that are configured for your resource connector policy (policy_config.json) file, run the command:
badmin rc view -c policies
An example of the output:
Policies
    Name: Policy1
        Consumer
            rcAccount: ["all"]
            templateName: ["all"]
            provider: ["all"]
        MaxNumber: 100
        StepValue: 5:10
    Name: Policy2
        Consumer
            rcAccount: ["default", "project1"]
            templateName: ["aws_template1"]
            provider: ["aws"]
        MaxNumber: 50
        StepValue: 5:20
    Name: Policy3
        Consumer
            perRcAccount: ["project1", "project2"]
            perTemplateName: ["ibm_template1", "ibm_template2"]
            perProvider: ["ibmcloudhpc"]
        MaxNumber: 100
        StepValue: 5:10
Optimizations
        4 hosts (aws:aws_template1) replaced by 1 hosts (aws:aws_template3)
        2 hosts (aws:aws_template1) replaced by 1 hosts (aws:aws_template2)
In addition, consider that the RC_DEMAND_POLICY parameter in the lsb.queues file contains the following example configuration:
RC_DEMAND_POLICY = THRESHOLD[[2,10] [4,5]]
With this configuration, the optimization first considers replacing four aws_template1 hosts with one aws_template3 host, and then considers replacing two aws_template1 hosts with one aws_template2 host. The RC_DEMAND_POLICY parameter defines a shorter buffer time for four jobs than for two jobs, so when the demand is triggered by four or more jobs, the first optimization rule is applied; otherwise, the second optimization rule is applied.
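
The threshold pairs in this example can be read with a short Python sketch. It assumes that each pair is [number of pending jobs, buffer time in minutes] and that demand is triggered when any pair is satisfied; the function names are hypothetical, and the sketch handles only the two-value form that is shown here.

```python
import re


def parse_threshold(value):
    """Parse 'THRESHOLD[[2,10] [4,5]]' into [(jobs, minutes), ...]."""
    pairs = re.findall(r"\[\s*(\d+)\s*,\s*(\d+)\s*\]", value)
    return [(int(jobs), int(minutes)) for jobs, minutes in pairs]


def demand_triggered(pending_jobs, pending_minutes, thresholds):
    """True if any (jobs, minutes) pair is satisfied (assumed semantics)."""
    return any(
        pending_jobs >= jobs and pending_minutes >= minutes
        for jobs, minutes in thresholds
    )


if __name__ == "__main__":
    thresholds = parse_threshold("THRESHOLD[[2,10] [4,5]]")
    print(thresholds)                           # [(2, 10), (4, 5)]
    print(demand_triggered(4, 5, thresholds))   # True
    print(demand_triggered(2, 5, thresholds))   # False
```

Under these assumptions, four pending jobs trigger demand after 5 minutes, while two pending jobs must wait 10 minutes.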