Amazon DevOps Guru is a machine learning (ML) powered service that helps developers and operators automatically detect anomalies and improve application availability. DevOps Guru uses machine learning models informed by years of Amazon.com and AWS operational excellence to identify anomalous application behavior (e.g., increased latency, error rates, resource constraints) and surface critical issues that could cause outages or service disruptions. When DevOps Guru identifies a critical issue, it automatically sends an alert to Amazon Simple Notification Service (SNS) and Systems Manager OpsCenter, and then provides a summary of related anomalies on the service console. The summary, or insight, includes the likely root cause, as well as context regarding when and where the issue occurred. When possible, DevOps Guru also provides recommendations for remediating the issue.
Since DevOps Guru was announced at re:Invent 2020, AWS has released additional features and expanded the resources covered. Now that it’s generally available (GA), we support CloudWatch Agent Container Insights and an improved dashboard experience that includes a system health summary based on resources, in addition to the existing CloudFormation Stacks view.
As a best practice, AWS recommends multi-region and multi-account deployments as your workloads expand in size and complexity. Enabling Amazon DevOps Guru across multiple accounts with AWS Organizations offers benefits such as quick, centralized deployment and central control over which AWS resources DevOps Guru monitors across your organization. It also lets you configure how DevOps Guru notifies the operations team when it detects anomalies and generates insights. You can send the alarms to an SNS topic, create incident items on Systems Manager OpsCenter, or both.
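For readers who script the notification side rather than use the console, the following is a minimal sketch of registering an SNS topic as a DevOps Guru notification channel with boto3's `add_notification_channel`. The helper names and the topic ARN are illustrative assumptions, not part of this post, and the call applies only to the account and region of the client that makes it.

```python
"""Sketch: register an SNS topic as a DevOps Guru notification channel."""


def build_sns_channel_config(topic_arn: str) -> dict:
    """Return the Config payload expected by AddNotificationChannel."""
    # Light sanity check only; it does not prove that the topic exists.
    if not topic_arn.startswith("arn:") or ":sns:" not in topic_arn:
        raise ValueError(f"not an SNS topic ARN: {topic_arn!r}")
    return {"Sns": {"TopicArn": topic_arn}}


def add_sns_channel(topic_arn: str, region: str) -> str:
    """Register the topic in one region and return the new channel ID."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("devops-guru", region_name=region)
    response = client.add_notification_channel(
        Config=build_sns_channel_config(topic_arn)
    )
    return response["Id"]
```

The payload builder is kept separate from the API call so that it can be checked without AWS credentials.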
As seen in this post, configuring DevOps Guru for your multi-account environment using AWS CloudFormation stacks was already possible. The new integration with AWS Systems Manager Quick Setup makes it easier and quicker to configure DevOps Guru across accounts in your AWS Organizations. This post walks you through the steps to use this new feature and utilize the real-time visualization of your configuration status provided by Quick Setup.
Quick Setup is a Systems Manager feature that lets you configure and deploy AWS services quickly with the recommended best practices. With Quick Setup, you can set up services in an individual account, or across multiple AWS accounts and regions within your AWS Organizations.
Instead of manually logging into individual accounts and configuring each region where your applications run, you can use Quick Setup to enable DevOps Guru across your multiple organizational units (OUs) and regions following the AWS best practices. Once you enable DevOps Guru for the selected OUs and regions, it will monitor your applications to provide insights if it detects anomalous behavior. DevOps Guru surfaces the issue by creating an OpsItem on Systems Manager OpsCenter, and by publishing a message on an SNS topic.
The following diagram shows a typical AWS Organization setup, with multiple organizational units containing different AWS accounts. It shows an administrator account where you enable DevOps Guru from Systems Manager Quick Setup, and a dedicated OpsCenter dashboard on each target account in order to collect the anomalies detected by DevOps Guru through OpsItems.
Figure 1: Target Architecture diagram
Specify which AWS resources you want DevOps Guru to analyze by using AWS CloudFormation stacks, or enable it to monitor every resource in your AWS account and region. Either way, in order to detect the anomalies and create insights, DevOps Guru monitors Amazon CloudWatch metrics along with AWS CloudTrail logs that your AWS resources generate.
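The two coverage choices described above map onto the DevOps Guru `update_resource_collection` API. The sketch below is an illustration under assumptions: the function names are hypothetical, and it relies on the convention, as we understand it, that a single stack name of `"*"` selects every supported resource in the account and region.

```python
"""Sketch: choose which resources DevOps Guru analyzes in one account/region."""


def build_resource_collection(stack_names=None) -> dict:
    """Build the arguments for UpdateResourceCollection.

    With no stack names, fall back to "*", which we assume means
    "analyze all resources in this account and region".
    """
    names = list(stack_names) if stack_names else ["*"]
    return {
        "Action": "ADD",
        "ResourceCollection": {"CloudFormation": {"StackNames": names}},
    }


def apply_resource_collection(client, stack_names=None) -> None:
    """Send the request; `client` is a boto3 "devops-guru" client."""
    client.update_resource_collection(**build_resource_collection(stack_names))
```

For example, `apply_resource_collection(boto3.client("devops-guru"), ["my-app-stack"])` would limit analysis to a single (hypothetical) stack.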
Before getting started, make sure you have these prerequisites:
- An organization with AWS Organizations. If you are not familiar with AWS Organizations terminology, refer to AWS Organizations terminology and concepts.
- Two or more organizational units (OUs)
- One or more target AWS accounts in each OU
- One central (or administrator) account with privileges to manage the target accounts
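The prerequisites above can be checked from the administrator account before you start. The following is a rough sketch that uses the AWS Organizations `list_roots`, `list_organizational_units_for_parent`, and `list_accounts_for_parent` calls. It inspects only the OUs directly under the first root (nested OUs are ignored), and the function names are illustrative assumptions.

```python
"""Sketch: check the OU and account prerequisites with AWS Organizations."""


def _list_all(call, key, **kwargs):
    """Collect every item from a paginated Organizations list call."""
    items, token = [], None
    while True:
        extra = {"NextToken": token} if token else {}
        page = call(**kwargs, **extra)
        items.extend(page[key])
        token = page.get("NextToken")
        if not token:
            return items


def check_prerequisites(org_client) -> dict:
    """Report whether there are at least two OUs, each with an account.

    `org_client` is a boto3 "organizations" client. Only top-level OUs
    under the first root are inspected.
    """
    root_id = org_client.list_roots()["Roots"][0]["Id"]
    ous = _list_all(
        org_client.list_organizational_units_for_parent,
        "OrganizationalUnits",
        ParentId=root_id,
    )
    empty_ous = sorted(
        ou["Name"]
        for ou in ous
        if not _list_all(
            org_client.list_accounts_for_parent, "Accounts", ParentId=ou["Id"]
        )
    )
    ou_count_ok = len(ous) >= 2
    return {
        "ou_count_ok": ou_count_ok,
        "empty_ous": empty_ous,
        "ready": ou_count_ok and not empty_ous,
    }
```

Calling `check_prerequisites(boto3.client("organizations"))` from the administrator account returns a small report; any OU listed under `empty_ous` still needs a target account.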
Enabling DevOps Guru with Systems Manager Quick Setup
Follow these steps to enable DevOps Guru by using Quick Setup:
- Choose a home region: If you are using AWS Quick Setup for the first time, choose a home region, which is the region where Quick Setup creates the AWS resources used to deploy your configurations.
Note: You can’t change the home region after you choose it.
Figure 2: Get Started with Quick Setup
- Choose a configuration type: To get started with Quick Setup for DevOps Guru, choose the DevOps Guru service from the available configuration types. These configuration types help you set up the service or feature to use the recommended best practices.
Figure 3: Quick Setup Configuration Type
- Specify configuration options: After choosing a configuration type, you can see the Customize DevOps Guru configuration options. By default, the SNS notifications and AWS Systems Manager OpsItems options are enabled. If you want DevOps Guru to analyze all of your organization’s AWS resources in all of your accounts, then select Analyze all AWS resources in all the accounts in my organization.
Note: The configuration options that you choose will apply to every selected AWS account in the Organizational Units and Regions. You’ll be charged by the number of AWS resource hours that DevOps Guru analyzes, so turning on DevOps Guru with such a broad scope can increase your bill. To understand how DevOps Guru will affect your bill, see Estimate Amazon DevOps Guru resource analysis costs to learn how to use our cost estimation tool. Learn more about the costs on DevOps Guru pricing.
Figure 4: Customize DevOps Guru Configuration options
- In the Schedule section, choose how frequently you want Quick Setup to remediate changes made to resources that differ from your configuration. If you want to apply those changes only once, then select the option Default. If you require a different frequency, then select the option Custom, and choose the frequency in the dropdown. If you don’t want Quick Setup to remediate changes, then choose Disable remediation under Custom.
Figure 5: Quick Setup Schedule
- In the Targets section, choose to allow Amazon DevOps Guru to analyze either your organizational units (OUs), or the account you’re logged in to. If you choose Custom, then in the Target OUs section select the check boxes of the OUs and Regions where you want to enable Amazon DevOps Guru.
Figure 6: Quick Setup Targets Organizational Units (OUs) for deployment
- If you choose Current account, then select either Current Region or Choose Regions, and pick from the list the regions where you want to enable Amazon DevOps Guru.
Figure 7: Quick Setup Target AWS Regions for deployment
- At the end, you receive a summary based on your configuration selections. Choose Create to create the configuration.
Figure 8: Quick Setup Configurations Summary
- After creating your configurations, you will see the details of every configuration.
Figure 9: Quick Setup Configuration List
- To see more configuration details, select any configuration from the list and click View details. On the Configuration details page, you can Edit or Delete your configuration. You can also Add or Remove OUs and Regions for your configuration.
Figure 10: Quick Setup Configurations Details
Integrating DevOps Guru with Systems Manager OpsCenter
AWS Systems Manager OpsCenter provides a centralized location for the operations or IT professionals team to view, investigate, and resolve operational work items, or OpsItems, that are related to AWS resources or services. Each OpsItem provides contextually relevant information, such as the name and ID of the AWS resource that generated the OpsItem, alarm or event details, alarm history, and an alarm timeline graph.
You can enable or disable OpsItem creation when you specify the DevOps Guru configuration options (as shown above in step 3). By default, AWS Systems Manager OpsItems is enabled for a DevOps Guru configuration. After creating a configuration, you can still enable or disable OpsItems by using Edit configuration options:
Figure 11: Quick Setup Edit Configuration Options
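Outside Quick Setup, the same toggle is exposed per account and region through the DevOps Guru `update_service_integration` API. The snippet below is a minimal sketch of that call, not what Quick Setup runs on your behalf; the helper names are assumptions.

```python
"""Sketch: toggle OpsItem creation for DevOps Guru in one account/region."""


def build_ops_center_integration(enabled: bool) -> dict:
    """Build the arguments for UpdateServiceIntegration."""
    status = "ENABLED" if enabled else "DISABLED"
    return {"ServiceIntegration": {"OpsCenter": {"OptInStatus": status}}}


def set_ops_center_integration(client, enabled: bool) -> None:
    """Apply the setting; `client` is a boto3 "devops-guru" client."""
    client.update_service_integration(**build_ops_center_integration(enabled))
```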
If you are setting up OpsCenter for the first time, it creates IAM roles and Default rules for OpsItems, as shown below:
Figure 12: AWS SSM OpsCenter Setup
Once DevOps Guru detects anomalous behavior on your application, it generates an insight on the target account, i.e., the account where the affected services run. If you have enabled the OpsItems integration, you will see a link to the OpsItem on the DevOps Guru insight:
Figure 13: Sample DevOps Guru Insights
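The insight-to-OpsItem link shown in the console can also be read programmatically in the target account. The sketch below assumes the boto3 `devops-guru` client's `list_insights` and `describe_insight` calls and reads the `SsmOpsItemId` field of each reactive insight. It looks only at the first page of results, and the function name is illustrative.

```python
"""Sketch: list ongoing reactive insights with their linked OpsItem IDs."""


def ongoing_insights_with_ops_items(client) -> list:
    """Return id, name, severity, and OpsItem ID for ongoing reactive insights.

    `client` is a boto3 "devops-guru" client in the target account.
    Pagination is ignored for brevity: only the first page is read.
    """
    page = client.list_insights(StatusFilter={"Ongoing": {"Type": "REACTIVE"}})
    results = []
    for summary in page.get("ReactiveInsights", []):
        detail = client.describe_insight(Id=summary["Id"])["ReactiveInsight"]
        results.append(
            {
                "id": summary["Id"],
                "name": summary.get("Name"),
                "severity": summary.get("Severity"),
                # None when the OpsItems integration is disabled.
                "ops_item_id": detail.get("SsmOpsItemId"),
            }
        )
    return results
```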
The OpsItem also appears in the OpsCenter console:
Figure 14: Sample OpsItem created in OpsCenter
Select the OpsItem to see more information about it:
Figure 15: Sample OpsItem details part-I
When you identify high priority issues, utilize OpsCenter to run Automation runbooks and quickly resolve those issues. By default, OpsCenter tags OpsItems using
value:true. Create or update OpsItem tags as documented on Tagging OpsItems.
Figure 16: Sample OpsItem details part-II
The Similar OpsItems and Related OpsItems features are useful for investigating operational issues while providing context regarding an issue’s scope. The Similar OpsItems feature is a system-generated list of OpsItems that might be related or of interest to you.
Figure 17: Sample OpsItem details part-III
Also, you can update the status of individual OpsItems as shown below:
Figure 18: Sample OpsItem Set Status Options
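The status change can also be scripted with the Systems Manager `update_ops_item` API. This is a small sketch: the set of statuses is limited to the three commonly used for operational work items (an assumption about what the console menu offers), and the function name is illustrative.

```python
"""Sketch: update the status of an OpsItem with Systems Manager."""

# Assumed subset of the statuses accepted by UpdateOpsItem.
CONSOLE_STATUSES = {"Open", "InProgress", "Resolved"}


def set_ops_item_status(ssm_client, ops_item_id: str, status: str) -> None:
    """Set the status; `ssm_client` is a boto3 "ssm" client."""
    if status not in CONSOLE_STATUSES:
        raise ValueError(f"unsupported status in this sketch: {status!r}")
    ssm_client.update_ops_item(OpsItemId=ops_item_id, Status=status)
```

For example, `set_ops_item_status(boto3.client("ssm"), "oi-...", "Resolved")` would close out an item once the issue is fixed, with the real OpsItem ID substituted for the placeholder.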
OpsCenter shows a summary of all open and in-progress OpsItems:
Figure 19: OpsCenter Summary Dashboard
This post demonstrated how you can easily enable DevOps Guru on your multi-account organization by using Systems Manager Quick Setup. From a central (administrator) account, you can configure DevOps Guru to monitor your AWS resources in specific organizational units (OUs) and regions. Combined with the Systems Manager OpsCenter integration, this gives you dashboards that collect the anomalies detected by DevOps Guru.
Now that Amazon DevOps Guru is generally available, you can try it for the first three months on the free tier, which includes 7,200 free AWS resource hours per month for each of resource groups A and B. You can also Estimate Amazon DevOps Guru resource analysis costs from the AWS console. This feature scans selected resources to automatically generate a monthly cost estimate. Furthermore, refer to Gaining operational insights with AIOps using Amazon DevOps Guru to learn more about how DevOps Guru helps you increase your applications’ availability, and check out this workshop for a hands-on walkthrough of DevOps Guru’s main features and capabilities.
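To put the free tier figure in perspective, here is a back-of-the-envelope calculation. It assumes that one resource analyzed around the clock consumes 24 resource hours per day; the function name is illustrative, and the actual bill depends on the pricing page and the cost estimation tool.

```python
"""Sketch: how many always-on resources the free tier covers per group."""

FREE_HOURS_PER_GROUP = 7200  # free resource hours per month, per resource group


def always_on_resources_covered(days_in_month: int = 30) -> int:
    """Whole number of resources that fit in the free hours when each
    resource is analyzed 24 hours a day for the entire month."""
    return FREE_HOURS_PER_GROUP // (24 * days_in_month)
```

Under that assumption, a 30-day month has 720 hours, so 7,200 free hours cover 10 continuously analyzed resources in each resource group.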