Use AWS spot instances
Use spot instances to bid on spare Amazon EC2 computing capacity. Since spot instances are often available at a discount compared to the pricing of On-Demand instances, you can significantly reduce the cost of running your applications, grow your application’s compute capacity and throughput for the same budget, and enable new types of cloud computing applications.
With spot instances, you can reduce your operating costs by 50-90% compared to On-Demand instances. Because the same budget then buys more instance hours, a 50-90% discount translates into 2-10 times the compute capacity.
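The relationship between the discount and the capacity gain is simple arithmetic: if each instance costs d percent less, the same budget buys 100 / (100 - d) times as many instance hours. The following is a minimal sketch of that calculation. The function name capacity_multiplier is made up for this illustration and is not part of LSF or AWS.

```shell
#!/bin/bash
# capacity_multiplier DISCOUNT_PERCENT
# Hypothetical helper (not an LSF or AWS command).
# Prints how many times more capacity a fixed budget buys when each
# instance costs DISCOUNT_PERCENT less: 100 / (100 - discount).
capacity_multiplier() {
  local discount=$1
  # A discount of 100% or more (or a negative one) has no meaningful multiplier.
  if [ "$discount" -lt 0 ] || [ "$discount" -ge 100 ]; then
    echo "discount must be between 0 and 99" >&2
    return 1
  fi
  awk -v d="$discount" 'BEGIN { printf "%.1f\n", 100 / (100 - d) }'
}

capacity_multiplier 50   # prints 2.0
capacity_multiplier 90   # prints 10.0
```

The two calls reproduce the ends of the 2-10 times range quoted above.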
Spot instances are supported on any Linux x86 system that is supported by LSF.
Spot Instances have some restrictions, including instance types and fleet limitations. For more information, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-limits.html
Requesting Spot instances
bsub -R "awshost && pricing==spot" myjob
Begin Resource
RESOURCENAME TYPE INTERVAL INCREASING DESCRIPTION
...
pricing String () () (Pricing option: spot/ondemand)
...
End Resource
Spot instances are reclaimed when the spot price goes higher than the current bid price.
awsprov_templates.json:
{
"templateId": "aws-spotvm-demo",
"maxNumber": 2,
"attributes": {
...
"awshost": ["Boolean", "1"],
"pricing": ["String", "spot"]
},
...
...
...
"userData": "pricing=spot"
},
#!/bin/bash
echo START >> /var/log/user-data.log 2>&1
# run hostsetup
...
if [ -n "${pricing}" ]; then
sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resourcemap ${pricing}*pricing]\"/" $LSF_CONF_FILE
echo "update LSF_LOCAL_RESOURCES lsf.conf successfully, add [resourcemap ${pricing}*pricing]" >> $logfile
fi
...
The user_data.sh script is located in the <LSF_TOP>/<LSF_VERSION>/resource_connector/aws/scripts directory.
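To see what the sed substitution in the excerpt does, it can be run against a throwaway file. The following sketch assumes a starting value of LSF_LOCAL_RESOURCES="[resource awshost]", which is only an example; the actual contents of lsf.conf on a provisioned host will differ. It uses a temporary file rather than a real lsf.conf and relies on GNU sed for the -i option.

```shell
#!/bin/bash
# Work on a temporary file so that no real lsf.conf is modified.
LSF_CONF_FILE=$(mktemp)

# Assumed sample value, for illustration only.
echo 'LSF_LOCAL_RESOURCES="[resource awshost]"' > "$LSF_CONF_FILE"

# In user_data.sh this value arrives through the template's userData setting.
pricing=spot

# Same substitution as in the excerpt: capture everything up to the closing
# quotation mark, then re-emit it followed by the resourcemap entry.
sed -i "s/\(LSF_LOCAL_RESOURCES=.*\)\"/\1 [resourcemap ${pricing}*pricing]\"/" "$LSF_CONF_FILE"

result=$(cat "$LSF_CONF_FILE")
echo "$result"
# prints: LSF_LOCAL_RESOURCES="[resource awshost] [resourcemap spot*pricing]"

rm -f "$LSF_CONF_FILE"
```

The appended [resourcemap spot*pricing] entry is what lets the host satisfy the pricing==spot requirement in a job's resource requirement string.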
Security requirements for spot instances
- You must create a Spot fleet role and add the corresponding Amazon Resource Name (ARN) to the awsprov_templates.json template configuration file. For steps to create a Spot fleet role, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet-requests.html#spot-fleet-prerequisites
- The AWS user linked to the access key that is stored in the credentials file must have the Spot fleet permissions to bid on, launch, and terminate the configured Spot fleets. For steps to add permissions to a user, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet-requests.html#spot-fleet-prerequisites
Logging and troubleshooting
To increase traceability, set the LogLevel parameter in the awsprov_config.json file to TRACE. This log level records each method entry with its parameter values and each method exit with its return value, if one exists.
Spot Fleet Request ID - Spot Instance Request ID - Spot Instance Machine ID: State update message
Limitations and known issues
- The Spot Instance Termination Notice is not accurate if the system clock is not synchronized between the management host and the compute host. System clock synchronization is required for reclaim to work. The following AWS topic explains this issue: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html
- If a request remains pending for 60 minutes, resource connector assumes that the request is lost. The request is ignored and LSF recalculates the demand. On the AWS side, however, the Spot instance request remains pending and is not closed.
- LSF periodically checks for hosts that are scheduled to be reclaimed and requeues their jobs within the 2-minute termination notice. However, AWS might not honor the 2-minute termination notice, in which case machines are terminated without warning. For more information, see: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html#spot-instance-termination-notices
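The clock synchronization requirement can be illustrated with a small calculation. A termination notice carries a UTC termination time, and the time that appears to remain depends on the clock of the host that reads it. The following sketch is an illustration only: it does not show how LSF itself evaluates the notice, the function name seconds_until and the timestamps are hypothetical, and it assumes GNU date for the -d option.

```shell
#!/bin/bash
# seconds_until TERMINATION_TIME NOW
# Hypothetical helper (not an LSF or AWS command).
# Both arguments are UTC timestamps such as 2024-01-01T00:02:00Z.
# Prints the number of seconds from NOW until TERMINATION_TIME
# (negative if NOW is already past the termination time).
seconds_until() {
  local term_epoch now_epoch
  term_epoch=$(date -u -d "$1" +%s) || return 1
  now_epoch=$(date -u -d "$2" +%s) || return 1
  echo $(( term_epoch - now_epoch ))
}

# Made-up termination time taken from a notice.
termination="2024-01-01T00:02:00Z"

# A host with an accurate clock sees the full 2-minute window.
seconds_until "$termination" "2024-01-01T00:00:00Z"   # prints 120

# A host whose clock runs 90 seconds fast believes only 30 seconds remain.
seconds_until "$termination" "2024-01-01T00:01:30Z"   # prints 30
```

A skew in the other direction is worse: a host whose clock runs slow overestimates the remaining time and might not requeue jobs before the instance is terminated.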