When working with customers, we often guide them toward AWS Budgets and related tools in the AWS platform to create cost and utilization guardrails. These tools can be used to conduct advanced, automated, and hands-free actions within your AWS environment – even across multiple accounts. This post walks you through a fully automated, forecast-based mechanism that alerts your developers when their spend is approaching a warning threshold. It then automatically shuts down their EC2 instances if their forecasted spend for the month will exceed a defined value.

This solution utilizes integrations with AWS Organizations and AWS CloudFormation in order to deploy a budget to every account in a specific organizational unit in your organization. In turn, this budget will send notifications through Amazon Simple Notification Service (SNS) when forecasted thresholds are exceeded. Then, we will utilize these SNS notifications to execute an AWS Lambda function that will shut down every EC2 instance that is not tagged as critical in a single region.

Some important notes about this solution:

  • We use a CloudFormation stack as part of a multi-account organization. However, you can also use the stack in a single-account context.
  • The stack presented here is not safe for production environment deployment as-is, and it is intended only for use in a development or test environment. As such, you must be careful and certain of where you deploy it.
  • Utilizing a budget notification with a Lambda function creates an extensible solution that allows nearly limitless possibilities for you to create your own cost-control measures. While you can use this stack as-is, we consider it a good starting-place for far more creative solutions.

Prerequisites

There are two prerequisites for this automated solution to be deployed in accounts within an organization:

  • In order for AWS Budgets to be created, use the management account of your organization to enable Cost Explorer (see this page for guidance).
  • Trusted access for AWS CloudFormation StackSets must be enabled for your organization (see this page for guidance).
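
The second prerequisite can also be checked programmatically before you deploy. The sketch below is a minimal illustration, not part of the original solution: the helper name is hypothetical, `org_client` is assumed to behave like `boto3.client("organizations")`, and the service principal string is our assumption of the value that StackSets registers with AWS Organizations.

```python
# Assumption: the service principal that StackSets registers for trusted access.
STACKSETS_PRINCIPAL = "member.org.stacksets.cloudformation.amazonaws.com"


def stacksets_trusted_access_enabled(org_client):
    """Return True if StackSets is among the organization's enabled service principals.

    org_client is expected to behave like boto3.client("organizations"),
    called from the management account.
    """
    kwargs = {}
    while True:
        page = org_client.list_aws_service_access_for_organization(**kwargs)
        for entry in page.get("EnabledServicePrincipals", []):
            if entry.get("ServicePrincipal") == STACKSETS_PRINCIPAL:
                return True
        token = page.get("NextToken")
        if not token:
            return False
        # Follow pagination until the list is exhausted.
        kwargs = {"NextToken": token}
```

If the function returns False, enable trusted access as described on the guidance page before creating the stack set.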

About AWS Budgets

AWS Budgets lets you set custom budgets to track your cost and usage from the simplest to the most complex use cases. AWS Budgets also supports email or SNS notification when actual or forecasted cost and usage exceed your budget threshold, or when your actual Reserved Instance and Savings Plans’ utilization or coverage drops below your desired threshold.

AWS Budgets is also integrated with AWS Cost Explorer, so that you can easily view and analyze your cost and usage drivers; with AWS Chatbot, so that you can receive budget alerts in your designated Slack channel or Amazon Chime room; and with AWS Service Catalog, so that you can track cost on your approved AWS portfolios and products.

Overview of a standard organization

Many customers’ AWS organizations will be similar to the diagram below, with development and production accounts split into discrete organizational units (OUs). Placing accounts into OUs that are mapped to their function lets customers create guardrails around the functionality of these accounts. Typically, these include security controls, such as blocking the provisioning of certain EC2 instance types or the creation of resources in specific regions. In our example, we will utilize the Sandbox OU as the root for a budget and associated automation.

An AWS Organization with a Sandbox OU and Workload OU as a hierarchy. Accounts are under both Sandbox OU and Workload OU. Service control policy is applied to Sandbox OU.

Figure 1: A typical AWS organization

Your organization will vary from this example in many ways. However, you can easily substitute a Sandbox OU for one of your own choosing.

Solution overview

AWS Budgets has two features that we will be using:

  1. Multiple budget alerts and thresholds can be created for each AWS account, up to a limit of five.
  2. These alerts can be delivered to an SNS topic, as well as directly to an email address.

As a first step, a warning alert will be delivered to an email address when the forecast spend for an account reaches a threshold of 80%. Then, if an account is forecast to spend 100% of its budget, an email will be delivered again and a Lambda function will be executed. In turn, this function will shut down every EC2 instance in the account that is not tagged as critical (in the same region where you deploy the solution).
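
When SNS invokes a Lambda function, the notification arrives as a list of records in the event. The sketch below shows one way a function could pull the subject and message out of that event for logging. The helper name is hypothetical, and the sample message text in the usage note is made up for illustration rather than taken from a real budget notification.

```python
def summarize_sns_records(event):
    """Return (subject, message) pairs from an SNS-triggered Lambda event."""
    summaries = []
    for record in event.get("Records", []):
        sns = record.get("Sns", {})
        summaries.append((sns.get("Subject"), sns.get("Message")))
    return summaries


# Illustrative event shape; the Subject and Message values are placeholders.
sample_event = {
    "Records": [
        {"Sns": {"Subject": "Budget notification", "Message": "Forecast exceeds threshold"}}
    ]
}
print(summarize_sns_records(sample_event))
```

The function provided later in this post does not inspect the message at all; it acts on any invocation from the critical topic.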

AWS Budget sending SNS notifications to two SNS topics. Warning SNS topic sends email, critical notification topic sends email, and triggers lambda function that act on EC2.

Figure 2: Architecture diagram of our solution

Step 1: Determine your budget and thresholds

Before proceeding, you must determine the total permissible spend per month for each AWS account. As presented in this blog, the CloudFormation stack will apply the same budget to every account in the same OU. However, this is only a starting point, and you can also adapt the solution to have per-account budgets. See Extending the solution below for more details.

You must also decide what your threshold percentages will be for warning and critical notifications. You can select your own threshold values, though the stack below has default values of 80% for warnings and 100% for critical notifications. Having a critical threshold of 200% of forecast budget is a valid approach as well, and many customers will routinely allow their teams to exceed their budgets.
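
To make the threshold arithmetic concrete, the sketch below converts percentages into dollar amounts and classifies a forecast against them. The function names are hypothetical. The strict "greater than" comparison mirrors the GREATER_THAN operator used in the stack, and the default percentages mirror the stack's defaults.

```python
def threshold_amounts(budget_amount, warning_pct=80, critical_pct=100):
    """Convert percentage thresholds into absolute amounts for a monthly budget."""
    return {
        "warning": budget_amount * warning_pct / 100,
        "critical": budget_amount * critical_pct / 100,
    }


def alert_level(forecast, budget_amount, warning_pct=80, critical_pct=100):
    """Classify a forecasted spend as 'ok', 'warning', or 'critical'."""
    limits = threshold_amounts(budget_amount, warning_pct, critical_pct)
    if forecast > limits["critical"]:
        return "critical"
    if forecast > limits["warning"]:
        return "warning"
    return "ok"


# A 500 USD budget with the default thresholds warns above 400 USD.
print(threshold_amounts(500))
print(alert_level(450, 500))
```

With a 200% critical threshold on the same 500 USD budget, a forecast of 900 USD would still only be a warning, and the shutdown would not run until the forecast exceeded 1,000 USD.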

Step 2: Create a service control policy

Before creating our budgets and automation, we will create a Service Control Policy (SCP) that will protect them from modification. Together, the four statements in this policy ensure that only the account that deployed the stack set can modify these resources.

  1. Statement1 blocks all roles except for the stack set execution role from modifying a budget.
  2. Statement2 blocks changes to the Lambda functions that are called by the critical budget threshold.
  3. Statement3 blocks changes to the SNS topics for the solution.
  4. Statement4 prevents a user in the account from creating their own IAM role that can bypass the previous three statements. Without it, someone with broad IAM privileges could spoof the stack set execution role.

Note that Statement4 is tailored to using CloudFormation stack sets with service-managed permissions. If you wish to proceed with self-managed permissions, then adjust this stanza accordingly. Details regarding using service-managed permissions for CloudFormation are available on this page.
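
The first three statements rely on a StringNotLike condition: the Deny applies to every principal whose ARN does not match the stack set execution role pattern. The sketch below approximates that wildcard test with Python's fnmatch module. It is an illustration only, since IAM performs its own matching, and the function name and example ARNs are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern taken from the SCP's aws:PrincipalARN condition.
EXEMPT_PATTERN = "arn:aws:iam::*:role/stacksets-exec-*"


def deny_applies(principal_arn, patterns=(EXEMPT_PATTERN,)):
    """Approximate StringNotLike: the Deny applies when the ARN matches no pattern."""
    return not any(fnmatchcase(principal_arn, pattern) for pattern in patterns)


# A developer role is denied; the stack set execution role is not.
print(deny_applies("arn:aws:iam::111122223333:role/Developer"))
print(deny_applies("arn:aws:iam::111122223333:role/stacksets-exec-abc123"))
```

This also shows why Statement4 matters: any role whose name starts with stacksets-exec- escapes the first three statements, so creating such roles must itself be restricted.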

This SCP is applied to the OUs that you wish to attach your budgets to. You must replace the following values in it before deployment:

  • Replace ACCOUNTNUMBER with the account number of the account that deploys the stack set. This can be either the management account or a delegated administrator account. See Register a delegated administrator for more information regarding delegated accounts for CloudFormation.
  • Replace STACKNAME with the name of the stack set that you will create in CloudFormation.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Deny",
      "Action": [
        "budgets:ModifyBudget",
        "budgets:UpdateBudgetAction"
      ],
      "Resource": [
        "*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/stacksets-exec-*"
          ]
        }
      }
    },
    {
      "Sid": "Statement2",
      "Effect": "Deny",
      "Action": [
        "lambda:DeleteFunction",
        "lambda:RemovePermission",
        "lambda:UpdateFunctionCode",
        "lambda:UpdateFunctionConfiguration",
        "lambda:UpdateFunctionEventInvokeConfig"
      ],
      "Resource": [
        "arn:aws:lambda:*:*:function:StackSet-STACKNAME-*-BudgetLambdaFunction-*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/stacksets-exec-*"
          ]
        }
      }
    },
    {
      "Sid": "Statement3",
      "Effect": "Deny",
      "Action": [
        "sns:DeleteTopic",
        "sns:AddPermission",
        "sns:DeleteEndpoint",
        "sns:RemovePermission",
        "sns:Unsubscribe"
      ],
      "Resource": [
        "arn:aws:sns:*:*:StackSet-STACKNAME-*-CriticalTopic-*",
        "arn:aws:sns:*:*:StackSet-STACKNAME-*-WarningTopic-*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/stacksets-exec-*"
          ]
        }
      }
    },
    {
      "Sid": "Statement4",
      "Effect": "Deny",
      "Action": [
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:UpdateRole"
      ],
      "Resource": [
        "arn:aws:iam::*:role/stacksets-exec-*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::ACCOUNTNUMBER:role/aws-service-role/stacksets.cloudformation.amazonaws.com/AWSServiceRoleForCloudFormationStackSetsOrgAdmin"
          ]
        }
      }
    }
  ]
}
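
If you prefer not to edit the policy by hand, the two placeholders can be substituted with a few lines of Python. This is a minimal sketch with a hypothetical helper name. It checks that the account number is 12 digits and that the result still parses as JSON, but it does not validate the policy against AWS Organizations.

```python
import json


def render_scp(template_text, account_number, stack_name):
    """Fill in ACCOUNTNUMBER and STACKNAME, then confirm the result is valid JSON."""
    if not (account_number.isdigit() and len(account_number) == 12):
        raise ValueError("account_number must be a 12-digit string")
    rendered = (
        template_text
        .replace("ACCOUNTNUMBER", account_number)
        .replace("STACKNAME", stack_name)
    )
    json.loads(rendered)  # raises ValueError if the text is not valid JSON
    return rendered


# Illustrative fragment; in practice, pass the full policy text shown above.
fragment = '{"Resource": "arn:aws:sns:*:*:StackSet-STACKNAME-*-CriticalTopic-*"}'
print(render_scp(fragment, "111122223333", "budget-guardrail"))
```

The account number and stack set name used here are placeholders; substitute the values you gathered for your own organization.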

To create the service control policy, navigate to AWS Organizations, and select Policies from the left-navigation menu. Under the Supported policy types, select service control policies.

 List of supported policy types for organization with Service Control policy highlighted.

Figure 3: Selecting Service control policies

On the service control policy console, click the Create policy button to create a new service control policy.

Create Policy button on top-right side highlighted.

Figure 4: Creating new policy

Enter a name and description, and paste the policy statements above into the policy editor. Remember to replace ACCOUNTNUMBER and STACKNAME with the values gathered earlier.

New service control policy form with policy name, description, and statements.

Figure 5: Entering policy details

Click the Create Policy button in order to complete the SCP creation.

Next, we will attach the newly created SCP to the target Development Organizational Unit, where we want the policy statements to be in effect. From the available policies screen, select the newly created policy by clicking the check-box on the left-hand side of the policy name. From the Actions list, select Attach policy.

Select the budget-control-modification-prevention policy, and select attach policy in the actions dropdown.

Figure 6: Attaching the policy

In the following screen, select the Development Organizational Unit that is the target for the policy by clicking the radio-button next to the OU name.

Development OU radio button selected in attach policy screen, and attach policy button highlighted in bottom right.

Figure 7: Specifying the OU to attach the policy

With this SCP created in advance, your budget, notifications, and Lambda function are protected from the moment that they are provisioned.
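
The console steps above can also be scripted. The sketch below is one possible approach under stated assumptions: the helper name is hypothetical, `org_client` is assumed to behave like `boto3.client("organizations")` in the management account, and the policy text and OU ID are values you supply.

```python
def create_and_attach_scp(org_client, name, description, policy_json, ou_id):
    """Create a service control policy and attach it to one OU; return the policy ID."""
    created = org_client.create_policy(
        Content=policy_json,
        Description=description,
        Name=name,
        Type="SERVICE_CONTROL_POLICY",
    )
    policy_id = created["Policy"]["PolicySummary"]["Id"]
    org_client.attach_policy(PolicyId=policy_id, TargetId=ou_id)
    return policy_id


# Example call, assuming boto3 is installed and credentials are configured:
#   import boto3
#   create_and_attach_scp(boto3.client("organizations"),
#                         "budget-control-modification-prevention",
#                         "Protects budget automation", policy_text, ou_id)
```

Scripting the attachment is mostly useful when you maintain the same guardrail across several OUs; for a single OU, the console flow shown above is just as quick.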

Step 3: Create your CloudFormation stack set

Now we can create our stack set. The actual CloudFormation stack is below. Review it carefully before deploying, and note these sections:

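  • The WarningTopic and CriticalTopic resources, along with their topic policies, create the SNS topics for warnings and alerts.
  • The Budget resource creates the actual budget and thresholds.
  • The BudgetLambdaExecutionRole, BudgetLambdaFunction, CriticalTopicSubscription, and BudgetLambdaFunctionPermission resources create the Lambda function and subscription to the critical notification topic.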

---
AWSTemplateFormatVersion: '2010-09-09'
Description: Stack that creates an AWS budget, notifications, and a Lambda function that will shut down EC2 instances
Parameters:
  BudgetAmount:
    Type: Number
    Description: Maximum permissible spend for the month
  Email:
    Type: String
    Description: Email address to deliver notifications to
  WarningThreshold:
    Type: Number
    Description: Percentage of forecast monthly spend for the warning notification
    Default: 80
  CriticalThreshold:
    Type: Number
    Description: Percentage of forecast monthly spend for the critical notification
    Default: 100
  ShutdownExemptionTagKey:
    Type: String
    Description: Key name to exempt from auto-shutdown
    Default: "instance-class"
  ShutdownExemptionTagValue:
    Type: String
    Description: Value of key name tag to exempt from auto-shutdown
    Default: "critical"
Outputs:
  BudgetId:
    Value: !Ref Budget
Resources:
  WarningTopic:
    Type: AWS::SNS::Topic
  WarningTopicPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties:
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sns:Publish
            Resource: "*"
            Principal:
              Service: budgets.amazonaws.com
      Topics:
        - !Ref WarningTopic
  CriticalTopic:
    Type: AWS::SNS::Topic
  CriticalTopicPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties:
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sns:Publish
            Resource: "*"
            Principal:
              Service: budgets.amazonaws.com
      Topics:
        - !Ref CriticalTopic
  Budget:
    Type: AWS::Budgets::Budget
    Properties:
      Budget:
        BudgetLimit:
          Amount: !Ref BudgetAmount
          Unit: USD
        TimeUnit: MONTHLY
        BudgetType: COST
      NotificationsWithSubscribers:
        - Notification:
            NotificationType: FORECASTED
            ComparisonOperator: GREATER_THAN
            Threshold: !Ref WarningThreshold
          Subscribers:
            - SubscriptionType: EMAIL
              Address: !Ref Email
            - SubscriptionType: SNS
              Address: !Ref WarningTopic
        - Notification:
            NotificationType: FORECASTED
            ComparisonOperator: GREATER_THAN
            Threshold: !Ref CriticalThreshold
          Subscribers:
            - SubscriptionType: EMAIL
              Address: !Ref Email
            - SubscriptionType: SNS
              Address: !Ref CriticalTopic
  BudgetLambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: BudgetLambdaExecutionRolePolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: arn:aws:logs:*:*:log-group:/aws/lambda/*-BudgetLambdaFunction-*:*
              # ec2:DescribeInstances does not support resource-level permissions,
              # so it is granted on "*" in its own statement.
              - Effect: Allow
                Action:
                  - ec2:DescribeInstances
                Resource: "*"
              - Effect: Allow
                Action:
                  - ec2:StopInstances
                Resource: arn:aws:ec2:*:*:instance/*
  BudgetLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      Description: Lambda function to be called after a critical budget threshold has been exceeded
      Handler: index.lambda_handler
      ReservedConcurrentExecutions: 1
      Role: !GetAtt BudgetLambdaExecutionRole.Arn
      Runtime: python3.8
      Timeout: 20
      Environment:
        Variables:
          ShutdownExemptionTagKey: !Ref ShutdownExemptionTagKey
          ShutdownExemptionTagValue: !Ref ShutdownExemptionTagValue
      Code:
        ZipFile: |
          import os

          import boto3


          def lambda_handler(event, context):
              ec2 = boto3.resource('ec2')
              exemption_tag_key = os.environ.get("ShutdownExemptionTagKey")
              # Compare case-insensitively, since tag values are lowercased below
              exemption_tag_value = os.environ.get("ShutdownExemptionTagValue", "").lower()
              # Build the list on each invocation so IDs do not carry over between warm starts
              instances = []
              # Only running instances are candidates for shutdown
              running = ec2.instances.filter(
                  Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
              )
              for instance in running:
                  print('Found instance: {}, checking exemption tag value...'.format(instance.id))
                  instance_class = 'undefined'
                  # instance.tags is None when an instance has no tags
                  for t in instance.tags or []:
                      if t["Key"] == exemption_tag_key:
                          instance_class = t["Value"].lower()
                  print(f"Instance class is: {instance_class}")
                  if instance_class != exemption_tag_value:
                      instances.append(instance.id)
              if len(instances) > 0:
                  print('Calling shutdown API for all discovered instance IDs')
                  response = ec2.instances.filter(InstanceIds=instances).stop()
                  print('Raw response from shutdown API:')
                  print(str(response))
              return True
  CriticalTopicSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Protocol: lambda
      TopicArn: !Ref CriticalTopic
      Endpoint: !GetAtt BudgetLambdaFunction.Arn
  BudgetLambdaFunctionPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !Ref BudgetLambdaFunction
      Principal: sns.amazonaws.com
      SourceArn: !Ref CriticalTopic

Deployment of this stack set is best managed using CloudFormation service-managed permissions, as this enables the automatic deployment and removal of stacks as accounts are added to OUs in AWS Organizations. Many of the options, as well as the use of features such as delegated administrator accounts, are at your discretion.

Service-managed permissions selected in configure stackset options.

Figure 8: Selecting Service-managed permission as Stackset permission

Note that the required OU ID is available within the Organizations console. You must copy this value into the AWS OU ID field when deploying your stack set. Up to 10 OUs can be specified per stack set, and any child OUs will inherit the deployment from their parent OU.
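
As an alternative to copying the OU ID from the console, you can look it up by name. The sketch below is a minimal example with a hypothetical helper name. It assumes `org_client` behaves like `boto3.client("organizations")` and that you already know the ID of the parent (for example, the organization root). It only searches the direct children of that parent.

```python
def find_ou_id(org_client, parent_id, ou_name):
    """Return the ID of the OU named ou_name directly under parent_id, or None."""
    kwargs = {"ParentId": parent_id}
    while True:
        page = org_client.list_organizational_units_for_parent(**kwargs)
        for ou in page.get("OrganizationalUnits", []):
            if ou["Name"] == ou_name:
                return ou["Id"]
        token = page.get("NextToken")
        if not token:
            return None
        # Follow pagination until every child OU has been checked.
        kwargs["NextToken"] = token
```

For nested OUs, call the function again with the returned ID as the new parent.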

Highlighting the OU ID in the Development OU details.

Figure 9: Finding the OU ID

Set Deployment options with field to enter OU ID highlighted.

Figure 10: Specifying target OU for deployment

Clicking the Next button will show the review screen. You must acknowledge that AWS CloudFormation will create IAM resources. Clicking Submit will create the stack set and deploy the stack components to the accounts under the target OU. While an AWS budget is global, the actual stack can only be deployed to a single region.
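
The same deployment can be expressed as two CloudFormation API requests. The sketch below only builds the request parameters, so that they can be reviewed before anything is created. The function names are hypothetical, and the automatic deployment settings shown are one reasonable choice rather than a requirement of the solution.

```python
def stack_set_request(stack_set_name, template_body, budget_amount, email):
    """Build keyword arguments for a service-managed stack set."""
    return {
        "StackSetName": stack_set_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "BudgetAmount", "ParameterValue": str(budget_amount)},
            {"ParameterKey": "Email", "ParameterValue": email},
        ],
        # The template creates an IAM role, so this acknowledgement is required.
        "Capabilities": ["CAPABILITY_IAM"],
        "PermissionModel": "SERVICE_MANAGED",
        # Assumption: deploy to new accounts in the OU and remove stacks when they leave.
        "AutoDeployment": {"Enabled": True, "RetainStacksOnAccountRemoval": False},
    }


def stack_instances_request(stack_set_name, ou_id, region):
    """Build keyword arguments that target one OU in a single region."""
    return {
        "StackSetName": stack_set_name,
        "DeploymentTargets": {"OrganizationalUnitIds": [ou_id]},
        "Regions": [region],
    }


# Example use, assuming boto3 is installed and credentials are configured:
#   cfn = boto3.client("cloudformation")
#   cfn.create_stack_set(**stack_set_request(name, template_text, 500, "team@example.com"))
#   cfn.create_stack_instances(**stack_instances_request(name, ou_id, "us-east-1"))
```

Passing a single region keeps the deployment consistent with the note above that the stack is deployed to one region only.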

Operating without shutting-down EC2 instances

This solution works well as a notification and monitoring tool, so you can easily deploy it without the automatic shutdown of EC2 instances. To do so, comment out every resource in the CloudFormation stack from BudgetLambdaExecutionRole onward (the execution role, the Lambda function, its SNS subscription, and its invoke permission). This leaves the budget capability in place as-is, but no Lambda execution will take place. Alternatively, you can keep the shutdown in place and exempt specific instances: the CloudFormation template lets you set a tag key and value for EC2 instances that must be exempted from shutdown. EC2 instances that carry the tag key and value specified in the template parameters are exempted. The defaults, set by the ShutdownExemptionTagKey and ShutdownExemptionTagValue parameters, are instance-class/critical. You can change them to the key/value pair that your organization uses to tag critical instances.
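
The exemption rule can be checked in isolation before you rely on it. The sketch below is a standalone version of the tag comparison with a hypothetical function name. It assumes tags arrive in the list-of-dictionaries shape that the EC2 API returns, and it treats the tag value as case-insensitive.

```python
def is_exempt(tags, exemption_key="instance-class", exemption_value="critical"):
    """Return True if the tag list carries the exemption key with the exemption value."""
    for tag in tags or []:  # tags may be None for an untagged instance
        if tag["Key"] == exemption_key and tag["Value"].lower() == exemption_value.lower():
            return True
    return False


print(is_exempt([{"Key": "instance-class", "Value": "Critical"}]))
print(is_exempt([{"Key": "Name", "Value": "build-agent"}]))
print(is_exempt(None))
```

Only the first instance in this example would be left running; untagged instances and instances with any other value are shut down.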

Note: The provided CloudFormation stack utilizes the “Forecasted” value to trigger the notification. If the target account is new, then it generally takes some time (typically a few days to a week) for the cost management tool to generate a “forecast” value.

Extending the solution and next steps

This CloudFormation stack is a good starting place for many customers, and it can be extended to perform any number of actions in your environment based on your need. Below are some common alterations that may be useful as you implement this solution.

First, the EC2 shutdown script can be easily extended to include Amazon RDS instances, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon SageMaker notebooks, or any number of other running resources. We presented EC2 here, as it is ubiquitous, and it is a good reference for your controls. Any actions you can script with Lambda are available to you.
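
As an illustration of such an extension, the sketch below applies the same exemption idea to Amazon RDS instances. It is not part of the provided stack. The function name is hypothetical, `rds_client` is assumed to behave like `boto3.client("rds")`, the tag list is assumed to be returned with each instance description, and pagination is omitted for brevity. The Lambda execution role would also need the rds:DescribeDBInstances and rds:StopDBInstance permissions.

```python
def stop_non_exempt_db_instances(rds_client, exemption_key, exemption_value):
    """Stop every available DB instance that does not carry the exemption tag."""
    stopped = []
    response = rds_client.describe_db_instances()
    for db in response.get("DBInstances", []):
        # Only instances in the "available" state can be stopped.
        if db.get("DBInstanceStatus") != "available":
            continue
        tags = {t["Key"]: t.get("Value", "") for t in db.get("TagList", [])}
        if tags.get(exemption_key, "").lower() == exemption_value.lower():
            continue
        rds_client.stop_db_instance(DBInstanceIdentifier=db["DBInstanceIdentifier"])
        stopped.append(db["DBInstanceIdentifier"])
    return stopped
```

Returning the list of stopped identifiers makes it easy to log what the function did on each invocation.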

You may wish to have different budget thresholds for each AWS account. You can accomplish this in two ways. One is to modify Statement1 within your SCP to permit another IAM user or role to change the thresholds, and then have that person or role change the threshold after the stack has been created. Another option is to use Systems Manager Parameter Store to keep new threshold values, reference them in the stack set, and then update the stack. This page details how to embed parameters from the Systems Manager Parameter Store.
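
For the first option, the permitted user or role could change a threshold through the AWS Budgets API. The sketch below only builds the request parameters, and it rests on several assumptions: the function names are hypothetical, the notification is assumed to use a percentage threshold, and the old notification must describe the existing alert for the update to succeed. Note that changing the budget outside CloudFormation causes the deployed stack to drift from its template.

```python
def forecast_notification(threshold_pct):
    """Describe a forecast-based alert that fires above the given percentage."""
    return {
        "NotificationType": "FORECASTED",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": float(threshold_pct),
        "ThresholdType": "PERCENTAGE",  # assumption: percentage-based thresholds
    }


def update_threshold_request(account_id, budget_name, old_pct, new_pct):
    """Build keyword arguments for replacing one alert threshold with another."""
    return {
        "AccountId": account_id,
        "BudgetName": budget_name,
        "OldNotification": forecast_notification(old_pct),
        "NewNotification": forecast_notification(new_pct),
    }


# Example use, assuming boto3 is installed and the caller is allowed by the SCP:
#   boto3.client("budgets").update_notification(
#       **update_threshold_request("111122223333", budget_name, 100, 200))
```

The account ID shown is a placeholder, and the budget name is the one generated by the stack in the target account.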

The approach utilized here is fully compatible with AWS Control Tower and the Customizations for Control Tower solution. This solution provides a convenient way to manage the deployment of the service control policy and the stack, all in one place. Likewise, updating the stack and SCP is conducted easily through the pipeline provided by this solution.

In conclusion, utilizing a programmatic approach to controlling developer account costs is straightforward and requires little effort to manage. We recommend that all customers use AWS Budgets wherever possible in order to maintain observability regarding their cloud consumption, and that they consider an automatic shutdown mechanism as a more advanced way to enforce their own cost control measures.

About the authors

Manoj Subhadevan

Manoj Subhadevan is a Senior Solutions Architect for Amazon Web Services Canada. He helps AWS customers to design and implement sophisticated, scalable, and secure solutions that solve their business challenges. During his free time, he listens to music and plans to travel the world.

Rich McDonough

Rich McDonough is a Solutions Architect for Amazon Web Services based in Toronto. His primary focus is on Management and Governance, helping customers scale their use of AWS safely and securely. Before joining AWS in 2018, he specialized in helping migrate customers into the cloud. Rich loves helping customers learn about AWS CloudFormation, AWS Config, and AWS Control Tower.