Containers have transformed how companies build and operate software. Bundling both application code and dependencies into a single container image improves agility and reduces deployment failures. But what compute platform should you choose to be most efficient, and what factors should you consider in this decision?
With the release of container image support for AWS Lambda functions (December 2020), customers now have an additional option for building serverless applications using their existing container-oriented tooling and DevOps best practices. In addition, a single container image can be configured to run on both of these compute platforms: AWS Lambda (using serverless functions) or AWS Fargate (using containers).
Three key factors can influence the decision of what platform you use to deploy your container: startup time, task runtime, and cost. That decision may vary each time a task is initiated, as shown in the three scenarios following.
Design considerations for deploying a container
Total task duration consists of startup time and runtime. The startup time of a containerized task is the time required to provision the container compute resource and deploy the container. Task runtime is the time it takes for the application code to complete.
Startup time: Some tasks must complete quickly. For example, when a user waits for a web response, or when a series of tasks is completed in sequential order. In those situations, the total duration time must be minimal. While the application code may be optimized to run faster, startup time depends on the chosen compute platform as well. AWS Fargate container startup time typically takes from 60 to 90 seconds. AWS Lambda initial cold start can take up to 5 seconds. Following that first startup, the same containerized function has negligible startup time.
Task runtime: The amount of time it takes for a task to complete is influenced by the compute resources allocated (vCPU and memory) and application code. AWS Fargate lets you select vCPU and memory size. With AWS Lambda, you define the amount of allocated memory. Lambda then provisions a proportional quantity of vCPU. In both AWS Fargate and AWS Lambda uses, increasing the amount of compute resources may result in faster completion time. However, this will depend on the application. While the additional compute resources incur greater cost, the total duration may be shorter, so the overall cost may also be lower.
AWS Lambda has a maximum limit of 15 minutes of runtime. Lambda shouldn’t be used for these tasks to avoid the likelihood of timeout errors.
Figure 1 illustrates the proportion of startup time to total duration. The initial steepness of each line shows a rapid decrease in startup overhead. This is followed by a flattening out, showing a diminishing rate of efficiency. Startup time delay becomes less impactful as the total job duration increases. Other factors (such as cost) become more significant.
Cost: When making the choice between Fargate and Lambda, it is important to understand the different pricing models. This way, you can make the appropriate selection for your needs.
- AWS Fargate on Amazon ECS pricing is calculated per second based on allocated memory, vCPUs, and storage
- AWS Lambda pricing is calculated per millisecond based on allocated memory and number of function invocations
Figure 2 shows a cost analysis of Lambda vs Fargate. This is for the entire range of configurations for a runtime task. For most of the range of configurable memory, AWS Lambda is more expensive per second than even the most expensive configuration of Fargate.
From a cost perspective, AWS Fargate is more cost-effective for tasks running for several seconds or longer. If cost is the only factor at play, then Fargate would be the better choice. But the savings gained by using Fargate may be offset by the business value gained from the shorter Lambda function startup time.
Dynamically choose your compute platform
In the following scenarios, we show how a single container image can serve multiple use cases. The decision to run a given containerized application on either AWS Lambda or AWS Fargate can be determined at runtime. This decision depends on whether cost, speed, or duration are the priority.
In Figure 3, an image-processing AWS Batch job runs on a nightly schedule, processing tens of thousands of images to extract location information. When run as a batch job, image processing may take 1–2 hours. The job pulls images stored in Amazon Simple Storage Service (S3) and writes the location metadata to Amazon DynamoDB. In this case, AWS Fargate provides a good combination of compute and cost efficiency. An added benefit is that it also supports tasks that exceed 15 minutes. If a single image is submitted for real-time processing, response time is critical. In that case, the same image-processing code can be run on AWS Lambda, using the same container image. Rather than waiting for the next batch process to run, the image is processed immediately.
In Figure 4, a SaaS application uses an AWS Lambda function to allow customers to submit complex text search queries for files stored in an Amazon Elastic File System (EFS) volume. The task should return results quickly, which is an ideal condition for AWS Lambda. However, a small percentage of jobs run much longer than the average, exceeding the maximum duration of 15 minutes.
A straightforward approach to avoid job failure is to initiate an Amazon CloudWatch alarm when the Lambda function times out. CloudWatch alarms can automatically retry the job using Fargate. An alternate approach is to capture historical data and use it to create a machine learning model in Amazon SageMaker. When a new job is initiated, the SageMaker model can predict the time it will take the job to complete. Lambda can use that prediction to route the job to either AWS Lambda or AWS Fargate.
In Figure 5, a customer runs a containerized legacy application that encompasses many different kinds of functions, all related to a recurring data processing workflow. Each function performs a task of varying complexity and duration. These can range from processing data files, updating a database, or submitting machine learning jobs.
Using a container image, one code base can be configured to contain all of the individual functions. Longer running functions, such as data preparation and big data analytics, are routed to Fargate. Shorter duration functions like simple queries can be configured to run using the container image in AWS Lambda. By using AWS Step Functions as an orchestrator, the process can be automated. In this way, a monolithic application can be broken up into a set of “Units of Work” that operate independently.
If your job lasts milliseconds and requires a fast response to provide a good customer experience, use AWS Lambda. If your function is not time-sensitive and runs on the scale of minutes, use AWS Fargate. For tasks that have a total duration of under 15 minutes, customers must decide based on impacts to both business and cost. Select the service that is the most effective serverless compute environment to meet your requirements. The choice can be made manually when a job is scheduled or by using retry logic to switch to the other compute platform if the first option fails. The decision can also be based on a machine learning model trained on historical data.