AWS strives for high availability and has 99.9% uptime for most services. However, in the rare event that an incident does occur, customers should be prepared to respond. AWS Health is the primary channel for communicating service degradation, scheduled changes, and resource-impacting issues. For customers running critical applications, access to proactive, real-time alerts is key to improving incident remediation and maintaining operational excellence. Speed and agility are crucial when monitoring health events and maintaining the reliability and availability of customers’ applications running on AWS.

AWS Health Aware (AHA) is an incident management and communication framework that delivers proactive and real-time alerts from AWS Health to a customer’s preferred communication channels. Customers using AWS Organizations can get aggregated, active account-level alerts from impacted accounts across their organization. Alerts can be sent to endpoints such as Slack, Microsoft Teams, Amazon Chime, and email. AHA can also be integrated with a broad range of other endpoints during configuration. These alerts give customers event visibility and guidance to help quickly diagnose and resolve issues that are impacting their applications or workloads.

Benefits of AHA

AHA uses the AWS Health API, which is only available to AWS customers who have a Business or Enterprise Support plan.

These customers can take advantage of the following features:

  1. Integration with communication platforms such as Slack, Amazon Chime, Microsoft Teams, and email for automated, real-time alerts about AWS incidents.
  2. Integration with Amazon EventBridge, with the capability to ingest both organizational and non-organizational alerts into an event bus. This helps customers integrate with more than 35 SaaS partners such as New Relic, Datadog, and PagerDuty.
  3. Aggregated Personal Health Dashboard (PHD) alerts with prescriptive guidance provided by AWS Health.
  4. Visibility into the AWS accounts and resources that are impacted for PHD alerts.
  5. Ability to filter out unwanted alerts by selecting specific region(s).

Scope

Before we get into the deployment, let’s look at the features and architecture of AHA to better understand how it all works together. In Figure 1, the available AWS Health API events are outlined based on whether the customer is utilizing AWS Organizations or not.

AWS Health events fall into four type categories: issue, account notification, scheduled change, and investigation.

Figure 1
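
As a rough illustration of how these type categories surface in the AWS Health API, the sketch below builds a filter for the DescribeEvents call in Python with boto3. It is not taken from the AHA source: the helper names build_event_filter and fetch_events are hypothetical, and the us-east-1 endpoint is an assumption based on the default global AWS Health endpoint.

```python
from datetime import datetime, timedelta, timezone

# The four event type categories as they are spelled in the AWS Health API.
EVENT_TYPE_CATEGORIES = ("issue", "accountNotification", "scheduledChange", "investigation")


def build_event_filter(categories, regions=None, search_back_hours=1, now=None):
    """Build the `filter` argument for the Health API DescribeEvents call."""
    unknown = set(categories) - set(EVENT_TYPE_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown event type categories: {sorted(unknown)}")
    now = now or datetime.now(timezone.utc)
    event_filter = {
        "eventTypeCategories": list(categories),
        # Only events updated within the search-back window.
        "lastUpdatedTimes": [{"from": now - timedelta(hours=search_back_hours)}],
    }
    if regions:
        event_filter["regions"] = list(regions)
    return event_filter


def fetch_events(event_filter):
    """Yield matching events. Needs boto3, credentials, and a qualifying Support plan."""
    import boto3  # imported lazily so the filter builder works without boto3

    # Assumption: the default global AWS Health endpoint in us-east-1.
    health = boto3.client("health", region_name="us-east-1")
    paginator = health.get_paginator("describe_events")
    for page in paginator.paginate(filter=event_filter):
        yield from page["events"]
```

For example, build_event_filter(["issue"], regions=["us-east-1"]) restricts the query to issues in a single region that were updated in the last hour.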

Architecture Overview

In Figure 2, a workflow shows the AWS services used to create real-time alerts for AWS Health events. A user uploads an AWS CloudFormation template (CFT) to an AWS account. The CFT retrieves the solution from an Amazon S3 bucket, then deploys AWS IAM roles, an Amazon EventBridge rule that runs on a schedule, an AWS Lambda function, webhook URLs in AWS Secrets Manager, and an Amazon DynamoDB table.

Diagram shows workflow described in the body of the post.

Figure 2

In this table, we list the resources from the AHA architecture and their purpose.

Resource                  Description
DynamoDBTable             DynamoDB table used to store event ARNs, updates, and TTL
ChimeChannelSecret        Webhook URL for Amazon Chime stored in AWS Secrets Manager
EventBusNameSecret        EventBus ARN for Amazon EventBridge stored in AWS Secrets Manager
LambdaExecutionRole       IAM role used for LambdaFunction
LambdaFunction            Main Lambda function that reads from the AWS Health API, sends to configured webhook URLs, and writes to DynamoDB
LambdaSchedule            Amazon EventBridge rule that runs every minute to invoke LambdaFunction
LambdaSchedulePermission  IAM role used for LambdaSchedule
MicrosoftChannelSecret    Webhook URL for Microsoft Teams stored in AWS Secrets Manager
SlackChannelSecret        Webhook URL for Slack stored in AWS Secrets Manager
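
To make the role of the DynamoDB table more concrete, here is a minimal sketch of how an event ARN, its last update time, and a TTL could be stored so that the same event is not alerted on every minute. The attribute names (arn, lastUpdatedTime, ttl), the 14-day retention, and both helper functions are assumptions for illustration and may not match the AHA source. The one firm requirement is that DynamoDB TTL expects a numeric attribute holding an epoch timestamp in seconds.

```python
from datetime import datetime, timedelta, timezone


def build_item(event_arn, last_updated, retention_days=14):
    """Build a table item for one Health event (hypothetical attribute names)."""
    expires = last_updated + timedelta(days=retention_days)
    return {
        "arn": event_arn,
        "lastUpdatedTime": last_updated.isoformat(),
        # DynamoDB TTL reads a Number attribute containing epoch seconds.
        "ttl": int(expires.timestamp()),
    }


def needs_alert(stored_item, last_updated):
    """Alert when the event is new or has been updated since it was stored."""
    if stored_item is None:
        return True
    return datetime.fromisoformat(stored_item["lastUpdatedTime"]) < last_updated
```

On each scheduled run, an event would be looked up by ARN, passed through needs_alert, and written back with build_item only when an alert was sent.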

EventBridge Extensibility

Amazon EventBridge makes it easy to connect applications together by delivering a stream of real-time data from custom sources, Amazon Web Services (AWS), and software-as-a-service (SaaS) applications.
The data can then be sent to a variety of targets like AWS Lambda, AWS Step Functions, Amazon Kinesis, and many more.
Amazon EventBridge also enables you to connect your applications with a range of SaaS partners without having to worry about building and maintaining custom infrastructure. Customers are using these capabilities to improve the scalability and reliability of their applications by building event-driven architectures rather than tight service coupling.
As more customers build end-to-end integrations with EventBridge, AHA aims to provide guidance on the best way to get started with the array of possible event sources and event types.
These use cases include:

  1. Auditing in near real-time, and archival of historical business operations events.
  2. Visualization and analysis of events for business intelligence and operational purposes.
  3. Automated alerting and remediation of application, service, and infrastructure systems.
  4. Connecting custom workflows with downstream consumers and legacy or on-premises applications.

Figure 3 shows the extensibility options of AHA through Amazon EventBridge integrations:

Diagram shows an event bus and Amazon EventBridge sending events to external SaaS providers.

Figure 3
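
The sketch below shows one way a Health event could be placed on a custom event bus with the EventBridge PutEvents API, along with a rule pattern that a downstream consumer might use to match it. The Source and DetailType values, the pattern, and the function names are hypothetical and are not necessarily what AHA emits.

```python
import json

# Hypothetical identifiers; choose values that suit your own routing rules.
SOURCE = "aha"
DETAIL_TYPE = "AHA Event"


def build_entry(health_event, event_bus_name):
    """Build one PutEvents entry carrying a Health event as its detail."""
    return {
        "Source": SOURCE,
        "DetailType": DETAIL_TYPE,
        # Detail must be a JSON string; default=str covers datetime values.
        "Detail": json.dumps(health_event, default=str),
        "EventBusName": event_bus_name,
    }


def issue_only_pattern():
    """Event pattern for a rule that forwards only 'issue' events to a target."""
    return {"source": [SOURCE], "detail": {"eventTypeCategory": ["issue"]}}


def send_entries(entries):
    """Send entries to EventBridge and return the number that failed."""
    import boto3  # imported lazily so the builders work without boto3

    events = boto3.client("events")
    failed = 0
    # PutEvents accepts at most 10 entries per request.
    for start in range(0, len(entries), 10):
        response = events.put_events(Entries=entries[start:start + 10])
        failed += response["FailedEntryCount"]
    return failed
```

A rule created with issue_only_pattern() on the same bus could then route matching events to a SaaS partner destination or to another AWS target.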

Overall prerequisites

Configuring an endpoint

AHA can send to multiple endpoints (webhook URLs, email, or EventBridge). To use any of these, you’ll need to set it up beforehand, as some are configured on third-party websites. We’ll go over some of the common ones here.

  • Creating an Amazon Chime Webhook URL (permissions required: Amazon Chime room access, ability to manage webhooks)
    1. Create a new chat room for events (e.g. aws_events)
    2. In the chat room created in step 1, click the gear icon, then click Manage webhooks and bots
    3. Click Add webhook
    4. Type a name for the bot (e.g. AHA Bot) and click Create
    5. Click Copy URL; we will need it for the deployment
  • Creating a Slack Webhook URL (permissions required: add a new channel and app in Slack)
    1. Create a new channel for events (e.g. aws_events)
    2. In your browser go to: workspace-name.slack.com/apps where workspace-name is the name of your Slack Workspace
    3. In the search bar, search for: Incoming Webhooks and click on it
    4. Click on Add to Slack
    5. From the drop-down, click on the channel you created in step 1 and click Add Incoming Webhooks integration
    6. From this page you can change the name of the webhook (e.g. AWS Bot), the icon/emoji to use, etc.
    7. For the deployment we will need the Webhook URL
  • Creating a Microsoft Teams Webhook URL (permissions required: add a new channel and app in Microsoft Teams)
    1. Create a new channel for events (e.g. aws_events)
    2. Within your Microsoft Team go to Apps
    3. In the search bar, search for: Incoming Webhook and click on it
    4. Click on Add to team
    5. Type in the name of the channel you created in step 1 and click Set up a connector
    6. From this page you can change the name of the webhook (e.g. AWS Bot), the icon/emoji to use, etc. Click Create when done
    7. For the deployment we will need the webhook URL that is presented
  • Configuring an Email
    1. You’ll be able to send email alerts to one or many addresses. However, you must first verify the email address(es) in the Amazon Simple Email Service (SES) console.
    2. AHA utilizes Amazon SES, so all you need to do is enter a To: address and a From: address.
    3. You may have to add a rule in your environment so that your email client doesn’t label the emails as spam.
  • Creating an Amazon EventBridge event bus
    1. Open the Amazon EventBridge console at https://console.aws.amazon.com/events/
    2. In the navigation pane, choose Event buses
    3. Choose Create event bus
    4. Enter a name for the new event bus
    5. Choose Create

For more information, please refer to the documentation.
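
Before deploying, it can help to confirm that a webhook URL works. The following sketch builds a test message for a Slack or Amazon Chime incoming webhook using only the Python standard library. The function names and the sample message are hypothetical. Slack incoming webhooks accept a JSON body with a text field, and Amazon Chime webhooks accept a JSON body with a Content field.

```python
import json
import urllib.request


def build_payload(channel, message):
    """Return the JSON body expected by the given webhook type."""
    if channel == "slack":
        return {"text": message}
    if channel == "chime":
        return {"Content": message}
    raise ValueError(f"unsupported channel: {channel}")


def build_request(webhook_url, channel, message):
    """Prepare (but do not send) an HTTP POST to the webhook URL."""
    body = json.dumps(build_payload(channel, message)).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling send(build_request(url, "slack", "AHA webhook test")) with the URL you copied earlier should post a message to the channel. Treat the URL as a secret, since anyone who has it can post to that channel.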

Setup

There are two ways to deploy AHA. Both use the same CloudFormation template to make deployment as easy as possible.

The two deployment methods for AHA are:

  1. AHA for users NOT using AWS Organizations: Users NOT using AWS Organizations (single accounts) will be able to get Service Health Dashboard (SHD) events and/or Personal Health Dashboard (PHD) events affecting that account ONLY.
  2. AHA for users who ARE using AWS Organizations: Users who ARE using AWS Organizations will be able to get Service Health Dashboard (SHD) events as well as aggregated Personal Health Dashboard (PHD) events for all accounts in their AWS Organization.

AHA without AWS Organizations

Additional prerequisites

  1. A Business or Enterprise Support plan
  2. Have at least 1 endpoint configured (you can have multiple)
  3. Have IAM access to deploy CloudFormation Templates with the following resources: AWS IAM policies, Amazon DynamoDB Tables, AWS Lambda functions, Amazon EventBridge rules and AWS Secrets Manager secrets.

Deployment steps

  1. Clone the AHA package from GitHub. If you’re not familiar with the process, here is some documentation.
  2. In the root of the AHA package you’ll have two files: handler.py and messagegenerator.py. Use your tool of choice to zip them both up, and give the archive a unique name (e.g. aha-v1.1.zip). Note: Putting the version number in the name will make upgrading AHA seamless.
  3. Upload the .zip you created in Step 2 to an S3 bucket in the same region you plan to deploy this in.
  4. In your AWS console go to CloudFormation
  5. In the CloudFormation console click Create stack, then With new resources (standard)
  6. Under Template Source click Upload a template file, click Choose file, and select CFN_AHA.yml. Click Next
  7. On the Specify stack details page enter in the following information:
    • In Stack name type a stack name (e.g. aha)
    • In AWSOrganizationsEnabled leave it set to the default, which is No. If you do have AWS Organizations enabled and you want to aggregate across all your accounts, you should be following the steps for AHA for users who ARE using AWS Organizations
    • In AWSHealthEventType select whether you want to receive all event types or only issues
    • In S3Bucket type just the bucket name of the S3 bucket used in step 3 (e.g. my-aha-bucket)
    • In S3Key type just the name of the .zip file you created in Step 2 (e.g. aha-v1.1.zip)
    • In the Communications Channels section enter the URLs, Emails and/or EventBus Name of the endpoints you configured previously.
    • In the Email Setup section enter the From and To Email addresses as well as the Email subject. If you aren’t configuring email, just leave it set to the defaults.
    • In EventSearchBack enter the number of hours you want to search back for events. Default is 1 hour.
    • In Regions enter in the regions you want to search for events in. Default is all regions. You can filter for up to 10, comma separated (e.g. us-east-1, us-east-2).
    • At the bottom of the page click Next
  8. On the Configure stack options page, scroll to the bottom and click Next
  9. On the Review page, scroll to the bottom and click the checkbox and click Create stack

Wait until Status changes to CREATE_COMPLETE (roughly 2-4 minutes)
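
Steps 2 and 3 above can also be scripted. The following is a minimal sketch, assuming the two files sit in the directory you pass in; the function names package_aha and upload_package are hypothetical, and the upload needs boto3 plus credentials with write access to the bucket.

```python
import zipfile
from pathlib import Path

# The two files named in step 2.
AHA_FILES = ("handler.py", "messagegenerator.py")


def package_aha(source_dir, version):
    """Zip the AHA files into aha-v<version>.zip and return the archive path."""
    source_dir = Path(source_dir)
    archive = source_dir / f"aha-v{version}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in AHA_FILES:
            # arcname keeps the files at the root of the archive.
            zf.write(source_dir / name, arcname=name)
    return archive


def upload_package(archive, bucket):
    """Upload the archive to S3 and return the object key (the S3Key value)."""
    import boto3  # imported lazily so packaging works without boto3

    boto3.client("s3").upload_file(str(archive), bucket, archive.name)
    return archive.name
```

The returned key and the bucket name are the values to enter in S3Key and S3Bucket. The bucket still has to be in the region where you deploy the stack.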

AHA with AWS Organizations

Additional prerequisites

  1. A Business or Enterprise Support plan.
  2. Have at least 1 endpoint configured (you can have multiple).
  3. Have IAM access to deploy CloudFormation Templates with the following resources: AWS IAM policies, Amazon DynamoDB Tables, AWS Lambda functions, Amazon EventBridge rules and AWS Secrets Manager secrets.
  4. Enable Health Organizational View from the console, so that you can aggregate all Personal Health Dashboard (PHD) events for all accounts in your AWS organization.
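
Prerequisite 4 can also be handled through the AWS Health API instead of the console. The sketch below checks the current status and enables organizational view only when needed. The function name is hypothetical, the calls must be made from the organization’s management account, and the client is passed in so that the logic can be exercised without AWS access.

```python
def ensure_organizational_view(health_client):
    """Enable Health organizational view if needed; return True if a request was made."""
    status = health_client.describe_health_service_status_for_organization()
    if status["healthServiceAccessStatusForOrganization"] == "ENABLED":
        return False
    # The call takes no arguments; repeating it while a request is pending is harmless.
    health_client.enable_health_service_access_for_organization()
    return True


def main():
    import boto3  # needs management-account credentials

    # Assumption: the default global AWS Health endpoint in us-east-1.
    health = boto3.client("health", region_name="us-east-1")
    changed = ensure_organizational_view(health)
    print("enable request sent" if changed else "already enabled")
```

Run main() once before creating the stack. It can take some time for events from member accounts to become visible after the feature is enabled.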

Deployment Steps

  1. Clone the AHA package from GitHub. If you’re not familiar with the process, here is some documentation.
  2. In the root of the AHA package you’ll have two files: handler.py and messagegenerator.py. Use your tool of choice to zip them both up, and give the archive a unique name (e.g. aha-v1.1.zip). Note: Putting the version number in the name will make upgrading AHA seamless.
  3. Upload the .zip you created in Step 2 to an S3 bucket in the same region you plan to deploy this in.
  4. In your AWS console go to CloudFormation
  5. In the CloudFormation console click Create stack, then With new resources (standard)
  6. Under Template Source click Upload a template file, click Choose file, and select CFN_AHA.yml. Click Next
  7. On the Specify stack details page enter in the following information:
    • In Stack name type a stack name (e.g. aha)
    • In AWSOrganizationsEnabled change the drop-down to Yes. If you do NOT have AWS Organizations enabled, you should be following the steps for AHA for users who are NOT using AWS Organizations
    • In AWSHealthEventType select whether you want to receive all event types or only issues
    • In S3Bucket type just the bucket name of the S3 bucket used in step 3 (e.g. my-aha-bucket)
    • In S3Key type just the name of the .zip file you created in Step 2 (e.g. aha-v1.1.zip)
    • In the Communications Channels section enter the URLs, Emails and/or EventBus Name of the endpoints you configured previously
    • In the Email Setup section enter the From and To Email addresses as well as the Email subject. If you aren’t configuring email, just leave it set to the defaults.
    • In EventSearchBack enter the number of hours you want to search back for events. Default is 1 hour.
    • In Regions enter in the regions you want to search for events in. Default is all regions. You can filter for up to 10, comma separated (e.g. us-east-1, us-east-2).
    • At the bottom of the page click Next
  8. On the Configure stack options page, scroll to the bottom and click Next
  9. On the Review page, scroll to the bottom and click the checkbox and click Create stack

Wait until Status changes to CREATE_COMPLETE (roughly 2-4 minutes)
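
The same stack can be created programmatically. The sketch below builds a subset of the stack parameters named in step 7 and passes them to CloudFormation with boto3. It is an illustration under assumptions rather than a supported deployment path: the function names are hypothetical, the communication channel, email, and AWSHealthEventType parameters are left out (so they fall back to whatever defaults the template defines), and the IAM capabilities are assumed to correspond to the acknowledgement checkbox on the Review page.

```python
def normalize_regions(regions):
    """Clean up a comma-separated region list and enforce the limit of 10."""
    names = [name.strip() for name in regions.split(",") if name.strip()]
    if len(names) > 10:
        raise ValueError("the Regions parameter accepts at most 10 regions")
    return ", ".join(names)


def build_parameters(bucket, key, organizations_enabled, search_back_hours=1, regions=None):
    """Build the Parameters list for CloudFormation from the values in step 7."""
    values = {
        "AWSOrganizationsEnabled": "Yes" if organizations_enabled else "No",
        "S3Bucket": bucket,
        "S3Key": key,
        "EventSearchBack": str(search_back_hours),
    }
    if regions:  # omitted means the template default (all regions) applies
        values["Regions"] = normalize_regions(regions)
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


def create_stack(stack_name, template_path, parameters):
    """Create the stack and block until CREATE_COMPLETE (or raise on failure)."""
    import boto3  # imported lazily so the builders work without boto3

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()
    cloudformation = boto3.client("cloudformation")
    cloudformation.create_stack(
        StackName=stack_name,
        # TemplateBody is limited to 51,200 bytes; larger templates need TemplateURL.
        TemplateBody=template_body,
        Parameters=parameters,
        # Assumption: the template creates IAM resources that need acknowledgement.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    )
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
```

For an organization-wide deployment, build_parameters("my-aha-bucket", "aha-v1.1.zip", True, regions="us-east-1, us-east-2") produces the values shown in the steps above.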

Uninstall AHA

Using the AWS Management Console

  1. Sign in to the AWS CloudFormation console
  2. Select this solution’s installation stack.
  3. Choose Delete.

You can also do this in the CLI by running the following command:

$ aws cloudformation delete-stack --stack-name <stack-name>
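
The equivalent in Python with boto3 might look like the sketch below, which also waits for the deletion to finish. The function name delete_aha_stack is hypothetical, and the client is passed in so the function can be tested without AWS access.

```python
def delete_aha_stack(stack_name, cloudformation_client):
    """Delete the stack and wait until CloudFormation reports it as deleted."""
    cloudformation_client.delete_stack(StackName=stack_name)
    waiter = cloudformation_client.get_waiter("stack_delete_complete")
    # Raises a WaiterError if the deletion fails or takes too long.
    waiter.wait(StackName=stack_name)


def main(stack_name):
    import boto3  # needs credentials allowed to delete the stack

    delete_aha_stack(stack_name, boto3.client("cloudformation"))
```

The S3 bucket that holds the .zip file was created outside the stack, so it is not removed by this step.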

Conclusion

In this post you learned how the AWS Health API can be used to alert customers with up-to-date information about AWS Health events affecting them. You deployed a serverless infrastructure via AWS CloudFormation that sends those alerts to your preferred communication channel(s). You should now be able to proactively monitor and react to AWS Health events for your personal and/or AWS Organizations account(s). To get started, visit the aws-samples GitHub repository and download AWS Health Aware (AHA).

About the Authors

Mridula Grandhi

Mridula Grandhi is a Principal Technical Account Manager for AWS. Mridula is also a Containers enthusiast and works with AWS customers to design, deploy, and manage their AWS workloads/architectures. You can reach her on Twitter via @gmridula1 (DMs are open).

Jordan Roth

Jordan Roth is a Senior Solution Architect specializing in VMC and Hybrid-Edge for AWS. Jordan assists AWS customers and partners with their cloud migration strategies. In his spare time, he enjoys traveling the globe with his wife, cooking, completing escape rooms, and running around with his two dogs.