This post is contributed by Scott Horsfield, Principal Solutions Architect for EC2 Scalability, Surabhi Agarwal, 
Senior Product Manager for EC2 Auto Scaling, and Chad Schmutzer, Principal Developer Advocate for Amazon EC2.

Customers have been using EC2 Auto Scaling to architect sophisticated, scalable, and robust applications on the AWS Cloud for over a decade. Launched in May of 2009, EC2 Auto Scaling is designed to help you maintain application availability by providing three key benefits: improving fault tolerance, increasing application availability, and lowering costs. In a nutshell, EC2 Auto Scaling ensures that your application:

  • Has just the right amount of compute when you need it, with unhealthy instances detected and replaced.
  • Stays balanced in capacity across Availability Zones.
  • Scales automatically in response to application demand with dynamic and predictive scaling policies.
  • Scales across purchase options to optimize performance and cost.

The ability to scale fast in response to real-time application demand is critical for many customers. Any latency in the process of adding compute to serve traffic can result in sluggish application response. While EC2 Auto Scaling strives to launch and add Amazon EC2 instances to an Auto Scaling group as fast as possible, some customer applications are bound by unavoidable latency that exists at the application initialization/bootstrap layer. This latency (which can often be several minutes or more) is the result of processes that can only happen at initial boot after a new EC2 instance has moved into the running state. For example, some applications must initiate lengthy processes such as applying updates, performing data or state hydration, or running custom configuration scripts. Customers often attempt to work around this unavoidable latency by over-provisioning in order to absorb unexpected demand increases. Over-provisioning compute resources results in increased cost, skewed scaling metrics and wasted compute. Customers with unavoidable latency living at the initialization layer have asked us for help with solving this problem.

Today, we are excited to tell you about EC2 Auto Scaling Warm Pools, a new feature that reduces scale-out latency by maintaining a pool of pre-initialized instances ready to be placed into service. EC2 Auto Scaling Warm Pools works by launching a configured number of EC2 instances in the background, allowing any lengthy application initialization processes to run as necessary, and then stopping those instances until they are needed. (You can also configure Warm Pool instances to be kept in a running state to further reduce scale-out latency.) When a scale-out event occurs, EC2 Auto Scaling uses the pre-initialized instances from the Warm Pool rather than launching cold instances, and any final initialization process can run before the instances are placed into service.

Warm Pools supports customizations such as configuring the size of the Warm Pool, changing what state to keep your Warm Pool instances in, or enabling lifecycle hooks that let you run additional initialization steps. For applications with long unavoidable initialization times, the immediate benefits can be faster scale-out times and a reduction in cost if over-provisioning was previously being used to compensate for the initialization latency. This post provides an overview of using EC2 Auto Scaling Warm Pools and walks through some example scenarios so you can learn how to take advantage of Warm Pools in your environment right away.

Getting started with Warm Pools in EC2 Auto Scaling

You can enable Warm Pools for your existing or new EC2 Auto Scaling groups with the new PutWarmPool API in the AWS CLI or any AWS SDK. When creating a new Warm Pool, you simply specify the Auto Scaling group name, optional min and max sizes of the Warm Pool, and the state of instances while in the Warm Pool (stopped or running). You can find more detail on the optional parameters of PutWarmPool in our documentation. As instances are launched into a Warm Pool, or moved from the Warm Pool to the Auto Scaling group, you can use lifecycle hooks to perform programmatic actions to prepare your instances for service. These actions could be as simple as installing, configuring, and starting an application. Or you could perform more complex workflows such as registering instances with the primary node of a cluster, updating DNS records, refreshing cached data, or downloading datasets required by your applications.
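If you prefer an SDK to the CLI, the call might look like the following minimal sketch in Python with boto3. The helper names build_put_warm_pool_request and put_warm_pool are my own for illustration and are not part of any AWS library. MinSize, MaxGroupPreparedCapacity, and PoolState are the request parameter names as I understand them from the API reference, so verify them against the current documentation before relying on this.

```python
from typing import Any, Dict, Optional


def build_put_warm_pool_request(
    group_name: str,
    pool_state: str = "Stopped",
    min_size: Optional[int] = None,
    max_group_prepared_capacity: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the Auto Scaling PutWarmPool operation."""
    # This post describes two pool states: stopped or running.
    if pool_state not in ("Stopped", "Running"):
        raise ValueError("pool_state must be 'Stopped' or 'Running'")
    request: Dict[str, Any] = {
        "AutoScalingGroupName": group_name,
        "PoolState": pool_state,
    }
    # Optional sizes are only sent when set, so service defaults apply otherwise.
    if min_size is not None:
        request["MinSize"] = min_size
    if max_group_prepared_capacity is not None:
        request["MaxGroupPreparedCapacity"] = max_group_prepared_capacity
    return request


def put_warm_pool(request: Dict[str, Any], region: str = "us-west-2") -> None:
    """Send the request with boto3 (needs boto3 installed and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("autoscaling", region_name=region)
    client.put_warm_pool(**request)


if __name__ == "__main__":
    print(build_put_warm_pool_request("Example Auto Scaling Group", min_size=1))
```

Keeping the request builder separate from the network call lets you inspect or unit test the parameters without touching your account.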

Warm Pools in action

To get started with Warm Pools, ensure you have access to a workstation with the most recent AWS CLI. Warm Pool API operations are accessible through the autoscaling group of CLI actions. At launch, three API operations are available for controlling Warm Pool configuration: PutWarmPool (put-warm-pool), DescribeWarmPool (describe-warm-pool), and DeleteWarmPool (delete-warm-pool). You can learn more about these APIs by running aws autoscaling help.

Let’s walk through a simple example and configure a Warm Pool for an Auto Scaling group. In this walkthrough, I target an Auto Scaling group that uses a lifecycle hook to install an application when an instance is launched into the Auto Scaling group or Warm Pool. It then validates that the application is running when the instance is started from a stopped state. You can follow along by deploying this example Auto Scaling group in your account.

Scaling an Auto Scaling group without Warm Pools

With an example Auto Scaling group deployed and set to a desired capacity of 0, I first measure how long it takes to launch an instance directly into my Auto Scaling group and perform my installation and configuration tasks during the instance launch. I then compare this with the time it takes to launch pre-initialized instances from the Warm Pool. You should see a significant reduction in launch time when launching pre-initialized instances from the Warm Pool. Pre-initialized instances in a Warm Pool can be launched to serve traffic in as little as 30 seconds.

aws autoscaling set-desired-capacity \
    --auto-scaling-group-name "Example Auto Scaling Group" \
    --desired-capacity 1

As the instance is launched, a lifecycle hook triggers and the Auto Scaling group waits for a signal from the instance’s user data script to indicate that the application has finished installation.

I can then run the following shell script to calculate the duration of the installation and configuration. This script uses a couple of open source tools to parse the JSON response from a DescribeScalingActivities API call. It then calculates the time elapsed between the start and end of the instance launch scaling activity. If you are following along, ensure you have jq and dateutils installed in your environment.

# Fetch the scaling activities for the group as JSON.
activities=$(aws autoscaling describe-scaling-activities --auto-scaling-group-name "Example Auto Scaling Group")

# Each activity is base64-encoded so it survives word splitting in the loop.
for row in $(echo "${activities}" | jq -r '.Activities[] | @base64'); do
    # Decode the current activity and extract one field from it.
    _jq() {
        echo "${row}" | base64 --decode | jq -r "${1}"
    }

    start_time=$(_jq '.StartTime')
    end_time=$(_jq '.EndTime')
    activity=$(_jq '.Description')

    # On some distributions the dateutils binary is named dateutils.ddiff.
    echo "$activity Duration: $(datediff "$start_time" "$end_time")"
done

The output of this command shows the duration of the instance launch.

Launching a new EC2 instance: i-075fa0ad6a018cdfc Duration: 243s
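If jq and dateutils are not available, the same calculation can be sketched in Python. This is a minimal illustration, not the script used for the results in this post: the function names are hypothetical, and the timestamps in the sample are made up so that the difference matches the 243 seconds shown above. It assumes the response has already been parsed from JSON into a dictionary.

```python
from datetime import datetime
from typing import Dict, List


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten for older Pythons."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def activity_durations(response: Dict) -> List[str]:
    """Return one 'Description Duration: Ns' line per completed scaling activity."""
    lines = []
    for activity in response.get("Activities", []):
        if "EndTime" not in activity:
            continue  # the activity is still in progress
        start = parse_timestamp(activity["StartTime"])
        end = parse_timestamp(activity["EndTime"])
        seconds = int((end - start).total_seconds())
        lines.append(f"{activity['Description']} Duration: {seconds}s")
    return lines


if __name__ == "__main__":
    # Hypothetical timestamps chosen for illustration only.
    sample = {
        "Activities": [
            {
                "Description": "Launching a new EC2 instance: i-075fa0ad6a018cdfc",
                "StartTime": "2021-03-02T20:40:00Z",
                "EndTime": "2021-03-02T20:44:03Z",
            }
        ]
    }
    for line in activity_durations(sample):
        print(line)
```

Activities without an EndTime are skipped, because a duration cannot be computed for a launch that has not finished.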

Adding a Warm Pool to the Auto Scaling group

When scaling an Auto Scaling group in response to load on your workload, the faster you can scale, the more readily you can serve your customers’ requests. Let’s see how you can improve your launch time by pre-initializing your instances and placing them in a Warm Pool so they’re ready when additional compute is needed to respond to requests.

The following command configures a Warm Pool for your Auto Scaling group and sets the instances in the Warm Pool to be in a stopped state after they’ve completed their initialization. This helps you optimize the cost of your pre-initialized capacity while maintaining a pool of instances that can be rapidly placed into service.

aws autoscaling put-warm-pool \
    --auto-scaling-group-name "Example Auto Scaling Group" \
    --pool-state Stopped \
    --region us-west-2

Instances launched into the Warm Pool transition through a lifecycle similar to that of instances launched directly into the Auto Scaling group. This means you can use the same lifecycle hooks you’ve already configured to ensure that your application is installed and started before your instances are brought in-service, and before they are placed into a Warm Pool.

You can view the state of instances using the DescribeWarmPool API.

aws autoscaling describe-warm-pool \
    --auto-scaling-group-name "Example Auto Scaling Group" \
    --region us-west-2

When an instance is first launched, it enters a Warmed:Pending state.

{
    "WarmPoolConfiguration": {
        "MinSize": 0,
        "PoolState": "Stopped"
    },
    "Instances": [
        {
            "InstanceId": "i-0ea10fdc59a07df6e",
            "InstanceType": "t2.micro",
            "AvailabilityZone": "us-west-2a",
            "LifecycleState": "Warmed:Pending",
            "HealthStatus": "Healthy",
            "LaunchTemplate": {
                "LaunchTemplateId": "lt-0356f1c452b0eb0eb",
                "LaunchTemplateName": "LaunchTemplate_O7hvkiPu9hmf",
                "Version": "1"
            }
        }
    ]
}

As the application is being installed and started, the instance enters a Warmed:Pending:Wait state until a CompleteLifecycleAction signal is received.

{
    "WarmPoolConfiguration": {
        "MinSize": 0,
        "PoolState": "Stopped"
    },
    "Instances": [
        {
            "InstanceId": "i-0ea10fdc59a07df6e",
            "InstanceType": "t2.micro",
            "AvailabilityZone": "us-west-2a",
            "LifecycleState": "Warmed:Pending:Wait",
            "HealthStatus": "Healthy",
            "LaunchTemplate": {
                "LaunchTemplateId": "lt-0356f1c452b0eb0eb",
                "LaunchTemplateName": "LaunchTemplate_O7hvkiPu9hmf",
                "Version": "1"
            }
        }
    ]
}

The instance launch then proceeds with a Warmed:Pending:Proceed state.

{
    "WarmPoolConfiguration": {
        "MinSize": 0,
        "PoolState": "Stopped"
    },
    "Instances": [
        {
            "InstanceId": "i-0ea10fdc59a07df6e",
            "InstanceType": "t2.micro",
            "AvailabilityZone": "us-west-2a",
            "LifecycleState": "Warmed:Pending:Proceed",
            "HealthStatus": "Healthy",
            "LaunchTemplate": {
                "LaunchTemplateId": "lt-0356f1c452b0eb0eb",
                "LaunchTemplateName": "LaunchTemplate_O7hvkiPu9hmf",
                "Version": "1"
            }
        }
    ]
}

Finally, the instance stops. It is now pre-warmed and ready to be started and moved to the Auto Scaling group when additional capacity is needed.

{
    "WarmPoolConfiguration": {
        "MinSize": 0,
        "PoolState": "Stopped"
    },
    "Instances": [
        {
            "InstanceId": "i-0ea10fdc59a07df6e",
            "InstanceType": "t2.micro",
            "AvailabilityZone": "us-west-2a",
            "LifecycleState": "Warmed:Stopped",
            "HealthStatus": "Healthy",
            "LaunchTemplate": {
                "LaunchTemplateId": "lt-0356f1c452b0eb0eb",
                "LaunchTemplateName": "LaunchTemplate_O7hvkiPu9hmf",
                "Version": "1"
            }
        }
    ]
}
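Rather than reading each response by eye, you could check programmatically whether the Warm Pool has finished warming. The following is a minimal sketch that works on a DescribeWarmPool response already parsed into a dictionary. The function names are hypothetical, and the rule that the final state is "Warmed:" followed by the configured PoolState is my assumption, based on the Warmed:Stopped state shown above.

```python
from collections import Counter
from typing import Dict


def summarize_warm_pool(response: Dict) -> Counter:
    """Count Warm Pool instances per LifecycleState."""
    return Counter(i["LifecycleState"] for i in response.get("Instances", []))


def warm_pool_ready(response: Dict) -> bool:
    """True when the pool is non-empty and every instance reached the pool state."""
    # Assumption: the settled state is "Warmed:" + PoolState, e.g. "Warmed:Stopped".
    target = "Warmed:" + response["WarmPoolConfiguration"]["PoolState"]
    states = summarize_warm_pool(response)
    return bool(states) and set(states) == {target}


if __name__ == "__main__":
    response = {
        "WarmPoolConfiguration": {"MinSize": 0, "PoolState": "Stopped"},
        "Instances": [
            {"InstanceId": "i-0ea10fdc59a07df6e", "LifecycleState": "Warmed:Stopped"}
        ],
    }
    print(summarize_warm_pool(response), warm_pool_ready(response))
```

A check like this could be run in a loop with a delay between calls until it returns True, before you start a scale-out test.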

Using the script provided earlier, you can measure the duration of launching the instance into the Warm Pool and compare this to launching directly into the Auto Scaling group. As you can see from the following results, launching into the Warm Pool takes about as long as launching directly into the Auto Scaling group, because the instance completes the same installation and configuration tasks through the lifecycle hook.

Launching a new EC2 instance into warm pool: i-0ea10fdc59a07df6e Duration: 260s
Launching a new EC2 instance: i-075fa0ad6a018cdfc Duration: 243s

Scaling the Auto Scaling group with Warm Pools

Now that you’ve configured your Warm Pool, you have pre-initialized instances that can be moved in-service without needing to complete the application installation and configuration tasks. Let’s scale out the Auto Scaling group and see if this improves the scaling speed.

aws autoscaling set-desired-capacity \
    --auto-scaling-group-name "Example Auto Scaling Group" \
    --desired-capacity 2

The Auto Scaling group needs an additional instance to meet the new desired capacity. Rather than launching a new instance, it starts the pre-initialized instance in your Warm Pool and moves it in-service. As instances are started, the lifecycle hook triggers, allowing you to perform any configuration required before the instance is moved in-service. For your Example Auto Scaling group, the user data script runs, detects that the application is already installed and configured, and ensures that the application is started. Since the application was already installed when the instance was launched into the Warm Pool, start-up time is significantly improved.
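The "detect, then only start" behavior described above is an idempotent bootstrap pattern. The following is a minimal sketch of that decision logic, not the user data script from the example deployment. It assumes a marker file records a finished installation; the marker path and the step names are hypothetical.

```python
from pathlib import Path
from typing import List


def plan_boot_steps(marker: Path) -> List[str]:
    """Return the steps to run on this boot, based on an install marker file."""
    steps: List[str] = []
    if not marker.exists():
        # First boot (cold launch or launch into the Warm Pool): do the slow work.
        steps += ["install", "configure"]
    # Every boot, including a start from the Warm Pool: make sure the app is up.
    steps.append("start")
    return steps


def record_install(marker: Path) -> None:
    """Create the marker once installation and configuration have succeeded."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()


if __name__ == "__main__":
    marker = Path("/tmp/example-app/.installed")  # hypothetical location
    print(plan_boot_steps(marker))
```

Because the slow steps only run when the marker is missing, an instance started from a stopped Warm Pool state skips straight to starting the application.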

Using your script, you can measure the time it takes to launch the instance from the Warm Pool.

Launching a new EC2 instance from warm pool: i-0ea10fdc59a07df6e Duration: 36s
Launching a new EC2 instance into warm pool: i-0ea10fdc59a07df6e Duration: 260s
Launching a new EC2 instance: i-075fa0ad6a018cdfc Duration: 243s

In this example, launching an instance from the Warm Pool decreased the launch time from over 4 minutes to just 36 seconds.

How Warm Pools improve scaling policy efficiency

In addition to improving scaling speed, Warm Pools keep pre-initialized instances from contributing to the Auto Scaling group metrics that affect scaling policies. Before Warm Pools, if customers wanted to keep pre-initialized instances available for their application, they needed to adjust their scaling policies to account for being over-provisioned. Or they needed to detach pre-initialized instances from the Auto Scaling group to avoid the instances being counted towards Auto Scaling group metrics. By placing your pre-initialized instances into a Warm Pool, your scaling policies can now more accurately reflect the true load being placed on your application. When additional compute is needed to serve traffic, pre-initialized instances can rapidly transition from the Warm Pool into the Auto Scaling group.
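To see why idle spares skew a scaling policy, consider a small worked example. All numbers here are hypothetical: four instances serving traffic at 70% CPU, two idle pre-initialized spares at 1% CPU, and a policy that scales out when average CPU exceeds 60%.

```python
from typing import Sequence


def average_utilization(cpu_percentages: Sequence[float]) -> float:
    """Average CPU across the instances that count toward the group metric."""
    return sum(cpu_percentages) / len(cpu_percentages)


# Hypothetical values for illustration only.
serving = [70.0, 70.0, 70.0, 70.0]  # instances handling traffic
idle_spares = [1.0, 1.0]            # pre-initialized spares kept in the group
scale_out_threshold = 60.0          # assumed scaling policy target

# Spares inside the group dilute the average the policy sees.
with_spares_in_group = average_utilization(serving + idle_spares)
# Spares in a Warm Pool are not counted, so the average reflects real load.
with_spares_in_warm_pool = average_utilization(serving)

print(with_spares_in_group, with_spares_in_group > scale_out_threshold)
print(with_spares_in_warm_pool, with_spares_in_warm_pool > scale_out_threshold)
```

With the spares counted in the group, the average falls to 47% and the policy does not react, even though every serving instance is above the threshold. With the spares in a Warm Pool, the average is 70% and the policy scales out.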

Additional Warm Pool and lifecycle hook control with Amazon EventBridge

For more control over how lifecycle hooks prepare instances for use, you can create Amazon EventBridge rules that trigger Lambda functions. This is useful if you must perform programmatic actions that include steps performed outside of an EC2 instance. With the launch of Warm Pools, additional data is now available in the EC2 Instance-launch Lifecycle Action event to identify the Origin and Destination of instance launches.

The following sample demonstrates an event that is generated when an instance is launching from a Warm Pool into an Auto Scaling group and a lifecycle hook is configured for the autoscaling:EC2_INSTANCE_LAUNCHING lifecycle action. You can use the Origin and Destination fields to perform separate actions based on where the instance is launching from and where it is launching to. For example, you could install an application as the instance is launching into a Warm Pool, and then start that application when the instance is launching from the Warm Pool into the Auto Scaling group.

You can follow this simple tutorial to get started with Amazon EventBridge and lifecycle hook events.

{
    "version": "0",
    "id": "22c76915-84ee-a131-7f2e-f1bad99180d8",
    "detail-type": "EC2 Instance-launch Lifecycle Action",
    "source": "aws.autoscaling",
    "account": "[ACCOUNT_ID]",
    "time": "2021-03-02T20:46:22Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:autoscaling:us-west-2:[ACCOUNT_ID]:autoScalingGroup:5b64870a-2427-4c84-8b13-1e12b4a0a28f:autoScalingGroupName/Example Auto Scaling Group"
    ],
    "detail": {
        "LifecycleActionToken": "eea5ce6e-5b75-4f67-bf78-af0cb3dd75b6",
        "AutoScalingGroupName": "Example Auto Scaling Group",
        "LifecycleHookName": "app-install-hook",
        "EC2InstanceId": "i-098eadadb4312906e",
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        "Origin": "WarmPool",
        "Destination": "AutoScalingGroup"
    }
}
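A Lambda function behind such a rule could branch on the Origin and Destination fields of this event. The following is a minimal sketch under stated assumptions: the function names and the action labels ("install", "start", "install-and-start") are hypothetical, and any launch that does not involve the Warm Pool is treated as a cold launch. The parameters passed to CompleteLifecycleAction are the ones I know from the API, but check them against the current SDK documentation.

```python
from typing import Any, Dict


def choose_action(event: Dict[str, Any]) -> str:
    """Pick a step from the Origin and Destination of a lifecycle action event."""
    detail = event["detail"]
    origin, destination = detail.get("Origin"), detail.get("Destination")
    if destination == "WarmPool":
        return "install"  # the instance is launching into the Warm Pool
    if origin == "WarmPool" and destination == "AutoScalingGroup":
        return "start"  # a pre-initialized instance is going in-service
    return "install-and-start"  # assumed: cold launch straight into the group


def completion_params(event: Dict[str, Any], result: str = "CONTINUE") -> Dict[str, str]:
    """Build keyword arguments for the CompleteLifecycleAction operation."""
    detail = event["detail"]
    return {
        "LifecycleHookName": detail["LifecycleHookName"],
        "AutoScalingGroupName": detail["AutoScalingGroupName"],
        "LifecycleActionToken": detail["LifecycleActionToken"],
        "LifecycleActionResult": result,
        "InstanceId": detail["EC2InstanceId"],
    }


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, str]:
    """Entry point: choose the step, run it, then release the lifecycle hook."""
    action = choose_action(event)
    # Run the chosen step here; how you do that depends on your application.
    import boto3  # provided by the Lambda Python runtime

    boto3.client("autoscaling").complete_lifecycle_action(**completion_params(event))
    return {"action": action}
```

For the sample event above, choose_action returns "start", because the instance is moving from the Warm Pool into the Auto Scaling group.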

Conclusion

Warm Pools are a great way to accelerate scale-out activities for your Auto Scaling groups. Pre-initialized instances are readily available to be placed into service when your workload requires additional compute capacity. When combined with lifecycle hooks, Warm Pools give you full control over the operation of your applications as instances enter and exit a Warm Pool. If you’re new to lifecycle hooks, we recommend following this tutorial as a next step. If you’re looking for sample code and CloudFormation templates to get you started, we have those available here for several scenarios including Lambda and user data managed lifecycle hooks for Windows and Linux instances.