Automated scaling is an approach to scaling workloads up or down automatically based on resource usage. In Kubernetes, the Horizontal Pod Autoscaler (HPA) can scale pods based on observed CPU and memory utilization. In more complex scenarios, we would account for other metrics before deciding to scale. For example, most web and mobile backends require automated scaling based on requests per second in order to handle traffic bursts. For ETL apps, automated scaling could be triggered by the job queue length exceeding a particular threshold, and so on. Instrumenting your applications with Prometheus and exposing the right metrics for autoscaling lets you fine-tune your apps to handle bursts better and ensure high availability.

Prometheus is an open-source monitoring and alerting toolkit that collects and stores its metrics as time series data. In other words, metrics are stored with the timestamp at which they were recorded, alongside optional key-value pairs called labels. Prometheus Adapter queries the custom metrics collected by Prometheus and exposes them through a Kubernetes API service, where they can be used readily by a Horizontal Pod Autoscaler object to make scaling decisions.
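As an illustration, a single time series pairs a metric name and its labels with timestamped samples. The pod name below is hypothetical; envoy_cluster_upstream_rq is the counter we work with later in this post:

```text
# metric_name{label="value", ...}  sample_value   (timestamp recorded at scrape time)
envoy_cluster_upstream_rq{namespace="prod", kubernetes_pod_name="jazz-v1-6b9b8bdc65-7xvjk"} 4283
```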

Managing long-term Prometheus storage infrastructure is challenging. To remove this heavy lifting, AWS launched Amazon Managed Service for Prometheus, a Prometheus-compatible monitoring service that makes it easy to securely monitor container infrastructure and application metrics at scale. Amazon Managed Service for Prometheus automatically scales the ingestion, storage, alerting, and querying of operational metrics as workloads scale up and down.

This post will show how to utilize Prometheus Adapter to autoscale Amazon EKS pods running an AWS App Mesh workload. AWS App Mesh is a service mesh that makes it easy to monitor and control services. A service mesh is an infrastructure layer dedicated to handling service-to-service communication, usually through an array of lightweight network proxies deployed alongside the application code. We will register the custom metric via a Kubernetes API service that the HPA will eventually use to make scaling decisions.

Prerequisites

You will need the following to complete the steps in this blog post:

Step 1: Create an Amazon EKS cluster

The architecture diagram shows AWS Distro for OpenTelemetry (ADOT) scraping metrics from an App Mesh-enabled application pod. ADOT sends the scraped metrics to Amazon Managed Service for Prometheus. Prometheus Adapter, deployed in the cluster, queries the workspace and registers the custom metrics under a custom metrics API server, which the HPA then uses to make scaling decisions.

Figure 1: Architecture diagram

We will create a custom metric from the envoy_cluster_upstream_rq counter exposed by Envoy. The same approach can be extended to any custom metric that the application emits.
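For example, once this counter is in Prometheus, a PromQL query along the following lines (a sketch; the exact label names depend on your relabeling configuration) turns it into a per-pod request rate:

```text
sum(rate(envoy_cluster_upstream_rq[1m])) by (kubernetes_pod_name)
```

This is essentially the transformation the Prometheus adapter will perform for us later in this post.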

First, create an Amazon EKS cluster enabled with AWS App Mesh for running the sample application. The eksctl CLI tool will deploy the cluster using the eks-cluster-config.yaml file:

export AMP_EKS_CLUSTER=AMP-EKS-CLUSTER
export AMP_ACCOUNT_ID=<Your Account id>
export AWS_REGION=<Your Region>
cat << EOF > eks-cluster-config.yaml
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: $AMP_EKS_CLUSTER
  region: $AWS_REGION
  version: '1.18'
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: appmesh-controller
      namespace: appmesh-system
      labels: {aws-usage: "application"}
    attachPolicyARNs:
    - "arn:aws:iam::aws:policy/AWSAppMeshFullAccess"
managedNodeGroups:
- name: default-ng
  minSize: 1
  maxSize: 3
  desiredCapacity: 2
  labels: {role: mngworker}
  iam:
    withAddonPolicies:
      certManager: true
      cloudWatch: true
      appMesh: true
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]
EOF

Execute the following command to create the EKS cluster:

eksctl create cluster -f eks-cluster-config.yaml

This creates an Amazon EKS cluster named AMP-EKS-CLUSTER and a service account named appmesh-controller that the AWS App Mesh controller will use for EKS.

Next, use the following commands to install the App Mesh controller.

First, get the Custom Resource Definitions (CRDs) in place, then install the controller with Helm:

kubectl apply -k "https://github.com/aws/eks-charts/stable/appmesh-controller/crds?ref=master"
helm repo add eks https://aws.github.io/eks-charts
helm upgrade -i appmesh-controller eks/appmesh-controller \
  --namespace appmesh-system \
  --set region=${AWS_REGION} \
  --set serviceAccount.create=false \
  --set serviceAccount.name=appmesh-controller

Step 2: Deploy sample application and enable AWS App Mesh

To install an application and inject an envoy container, use the AWS App Mesh controller for Kubernetes that you created earlier. AWS App Mesh Controller for K8s manages App Mesh resources in your Kubernetes clusters. The controller is accompanied by CRDs that allow you to define AWS App Mesh components, such as meshes and virtual nodes, via the Kubernetes API just as you define native Kubernetes objects, such as deployments and services. These custom resources map to AWS App Mesh API objects that the controller manages for you. The controller watches these custom resources for changes and reflects them into the AWS App Mesh API.

## Install the base application
git clone https://github.com/aws/aws-app-mesh-examples.git
kubectl apply -f aws-app-mesh-examples/examples/apps/djapp/1_base_application
kubectl get all -n prod ## check the pod status and make sure it is running
## Now install the App Mesh controller and meshify the deployment
kubectl apply -f aws-app-mesh-examples/examples/apps/djapp/2_meshed_application/
kubectl rollout restart deployment -n prod dj jazz-v1 metal-v1
kubectl get all -n prod ## Now we should see two containers running in each pod

Step 3: Create an Amazon Managed Service for Prometheus workspace

The Amazon Managed Service for Prometheus workspace ingests the Prometheus metrics collected from envoy. A workspace is a logical and isolated Prometheus server dedicated to Prometheus resources such as metrics. A workspace supports fine-grained access control for authorizing its management, such as update, list, describe, and delete, as well as ingesting and querying metrics.

aws amp create-workspace --alias AMP-APPMESH --region $AWS_REGION

Next, optionally create an interface VPC endpoint to securely access the managed service from resources deployed in your VPC; this ensures that data ingested by the managed service does not leave your VPC. (An Amazon Managed Service for Prometheus public endpoint is also available.) Use the AWS CLI as shown here, replacing the placeholder strings, such as SECURITY_GROUP_IDS and SUBNET_IDS, with your values.

export VPC_ID=<Your EKS Cluster VPC Id>
aws ec2 create-vpc-endpoint \
  --vpc-id $VPC_ID \
  --service-name com.amazonaws.$AWS_REGION.aps-workspaces \
  --security-group-ids <SECURITY_GROUP_IDS> \
  --vpc-endpoint-type Interface \
  --subnet-ids <SUBNET_IDS>

Step 4: Scrape the metrics using AWS Distro for OpenTelemetry

Amazon Managed Service for Prometheus does not directly scrape operational metrics from containerized workloads in a Kubernetes cluster. You must deploy and manage a Prometheus server or an OpenTelemetry agent, such as the AWS Distro for OpenTelemetry Collector or the Grafana Agent, to perform this task. This post walks you through configuring AWS Distro for OpenTelemetry (ADOT) to scrape the envoy metrics. The ADOT-AMP pipeline lets us use the ADOT Collector to scrape a Prometheus-instrumented application and send the scraped metrics to Amazon Managed Service for Prometheus.

We will also configure an IAM role with permission to send Prometheus metrics to Amazon Managed Service for Prometheus, then install the ADOT collector on the Amazon EKS cluster and forward metrics to the workspace.

Configure permissions

We will be deploying the ADOT collector to run under the identity of a Kubernetes service account “amp-iamproxy-service-account”. With IAM roles for service accounts (IRSA), you can associate the AmazonPrometheusRemoteWriteAccess role with a Kubernetes service account, thereby providing IAM permissions to any pod utilizing the service account to ingest the metrics to Amazon Managed Service for Prometheus.

You need kubectl and eksctl CLI tools in order to run the script. They must be configured to access your Amazon EKS cluster.

kubectl create namespace prometheus
eksctl create iamserviceaccount --name amp-iamproxy-service-account --namespace prometheus --cluster $AMP_EKS_CLUSTER --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess --approve
export WORKSPACE=$(aws amp list-workspaces | jq -r '.workspaces[] | select(.alias=="AMP-APPMESH").workspaceId')
export REGION=$AWS_REGION
export REMOTE_WRITE_URL="https://aps-workspaces.$REGION.amazonaws.com/workspaces/$WORKSPACE/api/v1/remote_write"

Now create a manifest file, amp-eks-adot-prometheus-daemonset.yaml, with the scrape configuration in order to extract envoy metrics and deploy the ADOT collector. This example deploys a DaemonSet named adot-collector. The adot-collector DaemonSet collects metrics from pods on the cluster.

cat > amp-eks-adot-prometheus-daemonset.yaml <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: adot-collector-conf
  namespace: prometheus
  labels:
    app: aws-adot
    component: adot-collector-conf
data:
  adot-collector-config: |
    receivers:
      prometheus:
        config:
          global:
            scrape_interval: 15s
            scrape_timeout: 10s
          scrape_configs:
          - job_name: 'appmesh-envoy'
            metrics_path: /stats/prometheus
            kubernetes_sd_configs:
            - role: pod
            relabel_configs:
            - source_labels: [__meta_kubernetes_pod_container_name]
              action: keep
              regex: '^envoy$'
            - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
              action: replace
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: \${1}:9901
              target_label: __address__
            - action: labelmap
              regex: __meta_kubernetes_pod_label_(.+)
            - source_labels: [__meta_kubernetes_namespace]
              action: replace
              target_label: namespace
            - source_labels: ['app']
              action: replace
              target_label: service
            - source_labels: [__meta_kubernetes_pod_name]
              action: replace
              target_label: kubernetes_pod_name
          - job_name: 'kubernetes-service-endpoints'
            kubernetes_sd_configs:
            - role: endpoints
            tls_config:
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
              insecure_skip_verify: true
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            relabel_configs:
            - source_labels: [__meta_kubernetes_service_annotation_scrape]
              action: keep
              regex: true
    exporters:
      awsprometheusremotewrite:
        # replace this with your endpoint
        endpoint: "$REMOTE_WRITE_URL"
        # replace this with your region
        aws_auth:
          region: "$REGION"
          service: "aps"
      logging:
        loglevel: info
    extensions:
      health_check:
      pprof:
        endpoint: :1888
      zpages:
        endpoint: :55679
    service:
      extensions: [pprof, zpages, health_check]
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [logging, awsprometheusremotewrite]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: adotcol-admin-role
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: adotcol-admin-role-binding
subjects:
- kind: ServiceAccount
  name: amp-iamproxy-service-account
  namespace: prometheus
roleRef:
  kind: ClusterRole
  name: adotcol-admin-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Service
metadata:
  name: adot-collector
  namespace: prometheus
  labels:
    app: aws-adot
    component: adot-collector
spec:
  ports:
  - name: metrics  # Default endpoint for querying metrics.
    port: 8888
  selector:
    component: adot-collector
  type: NodePort
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: adot-collector
  namespace: prometheus
  labels:
    app: aws-adot
    component: adot-collector
spec:
  selector:
    matchLabels:
      app: aws-adot
      component: adot-collector
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: aws-adot
        component: adot-collector
    spec:
      serviceAccountName: amp-iamproxy-service-account
      containers:
      - command:
        - "/awscollector"
        - "--config=/conf/adot-collector-config.yaml"
        image: public.ecr.aws/aws-observability/aws-otel-collector:latest
        name: adot-collector
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 200m
            memory: 400Mi
        ports:
        - containerPort: 8888  # Default endpoint for querying metrics.
        volumeMounts:
        - name: adot-collector-config-vol
          mountPath: /conf
        livenessProbe:
          httpGet:
            path: /
            port: 13133  # Health Check extension default port.
        readinessProbe:
          httpGet:
            path: /
            port: 13133  # Health Check extension default port.
      volumes:
      - configMap:
          name: adot-collector-conf
          items:
          - key: adot-collector-config
            path: adot-collector-config.yaml
        name: adot-collector-config-vol
---
EOF
kubectl apply -f amp-eks-adot-prometheus-daemonset.yaml

After the ADOT collector is deployed, it collects the metrics and ingests them into the specified Amazon Managed Service for Prometheus workspace. The scrape configuration above is similar to that of a Prometheus server, with the necessary configuration added for scraping envoy metrics.

Step 5: Deploy the Prometheus Adapter to register custom metric

We will create a service account named monitoring to run the Prometheus adapter, and assign it the AmazonPrometheusQueryAccess policy using IRSA.

kubectl create namespace monitoring
eksctl create iamserviceaccount --name monitoring --namespace monitoring --cluster $AMP_EKS_CLUSTER --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess --approve --override-existing-serviceaccounts
cat > pma-cm.yaml << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'envoy_cluster_upstream_rq{namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "envoy_cluster_upstream_rq"
        as: "appmesh_requests_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
EOF
kubectl apply -f pma-cm.yaml
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 365 -nodes -out serving.crt -keyout serving.key -subj "/C=CN/CN=custom-metrics-apiserver.monitoring.svc.cluster.local"
kubectl create secret generic -n monitoring cm-adapter-serving-certs --from-file=serving.key=./serving.key --from-file=serving.crt=./serving.crt

The Envoy sidecar utilized by AWS App Mesh exposes the request counter envoy_cluster_upstream_rq. The Prometheus adapter configuration above transforms this counter into a requests-per-second rate. The adapter connects to the Amazon Managed Service for Prometheus query endpoint through a SigV4 proxy.
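To make the template concrete, here is roughly how the adapter expands its metricsQuery for pods in the prod namespace (the pod name below is hypothetical):

```text
# <<.Series>>        -> envoy_cluster_upstream_rq
# <<.LabelMatchers>> -> namespace="prod",kubernetes_pod_name="jazz-v1-6b9b8bdc65-7xvjk"
# <<.GroupBy>>       -> kubernetes_pod_name
sum(rate(envoy_cluster_upstream_rq{namespace="prod",kubernetes_pod_name="jazz-v1-6b9b8bdc65-7xvjk"}[1m])) by (kubernetes_pod_name)
```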

We will now deploy the Prometheus adapter to create the custom metric:

cat > prometheus-adapter.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-reader
subjects:
- kind: ServiceAccount
  name: monitoring
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: monitoring
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: monitoring
  namespace: monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: custom-metrics-apiserver
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
      name: custom-metrics-apiserver
    spec:
      serviceAccountName: monitoring
      containers:
      - name: custom-metrics-apiserver
        image: directxman12/k8s-prometheus-adapter-amd64
        args:
        - /adapter
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://localhost:8080/workspaces/$WORKSPACE
        - --metrics-relist-interval=30s
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: true
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
      - name: aws-iamproxy
        image: public.ecr.aws/aws-observability/aws-sigv4-proxy:1.0
        args:
        - --name
        - aps
        - --region
        - $AWS_REGION
        - --host
        - aps-workspaces.$AWS_REGION.amazonaws.com
        ports:
        - containerPort: 8080
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: cm-adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config
---
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  ports:
  - port: 443
    targetPort: 6443
  selector:
    app: custom-metrics-apiserver
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
EOF
kubectl apply -f prometheus-adapter.yaml

The manifest also registers an APIService so that the Prometheus adapter is reachable through the Kubernetes aggregation layer, which is how the Horizontal Pod Autoscaler fetches the metric. Query the custom metrics API to verify that the metric has been created:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/appmesh_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/appmesh_requests_per_second",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}

Now you can use the appmesh_requests_per_second metric in the HPA definition with the following HPA resource:

cat > hpa.yaml <<EOF
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: envoy-hpa
  namespace: prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: jazz-v1
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: appmesh_requests_per_second
      targetAverageValue: 10m
EOF
kubectl apply -f hpa.yaml

Now the HPA will scale out the pods whenever the average value of the appmesh_requests_per_second metric per pod exceeds the 10m (0.01 requests per second) target.
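The scaling math itself follows the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetValue), clamped between minReplicas and maxReplicas. A minimal sketch in Python, not part of the walkthrough, just to illustrate the arithmetic with the values we will see in the HPA status below:

```python
import math

def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Standard HPA scaling formula, clamped to the configured replica bounds."""
    desired = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, desired))

# With 8 pods averaging 622m (0.622 req/s) against a 10m (0.01 req/s) target,
# the raw desired count far exceeds maxReplicas, so the HPA clamps it to 10:
print(desired_replicas(8, 0.622, 0.010))  # -> 10
```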

Let us add some load to trigger the autoscaling actions:

dj_pod=`kubectl get pod -n prod --no-headers -l app=dj -o jsonpath='{.items[*].metadata.name}'`
loop_counter=0
loop_counter=0
while [ $loop_counter -le 300 ]; do
  kubectl exec -n prod -it $dj_pod -c dj -- curl jazz.prod.svc.cluster.local:9080
  echo
  loop_counter=$((loop_counter+1))
done

Describing the HPA will show the scaling actions resulting from the load we introduced.

kubectl describe hpa -n prod
Name:                envoy-hpa
Namespace:           prod
Labels:              <none>
Annotations:         <none>
CreationTimestamp:   Mon, 06 Sep 2021 04:19:37 +0000
Reference:           Deployment/jazz-v1
Metrics:             ( current / target )
  "appmesh_requests_per_second" on pods:  622m / 10m
Min replicas:        1
Max replicas:        10
Deployment pods:     8 current / 10 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    SucceededRescale  the HPA controller was able to update the target scale to 10
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric appmesh_requests_per_second
  ScalingLimited  True    TooManyReplicas   the desired replica count is more than the maximum replica count
Events:
  Type     Reason               Age                 From                       Message
  ----     ------               ---                 ----                       -------
  Warning  FailedGetPodsMetric  58m (x44 over 69m)  horizontal-pod-autoscaler  unable to get metric appmesh_requests_per_second: unable to fetch metrics from custom metrics API: the server could not find the metric appmesh_requests_per_second for pods
  Normal   SuccessfulRescale    41s                 horizontal-pod-autoscaler  New size: 4; reason: pods metric appmesh_requests_per_second above target
  Normal   SuccessfulRescale    26s                 horizontal-pod-autoscaler  New size: 8; reason: pods metric appmesh_requests_per_second above target
  Normal   SuccessfulRescale    11s                 horizontal-pod-autoscaler  New size: 10; reason: pods metric appmesh_requests_per_second above target

Clean-up

Use the following commands to delete resources created during this post:

aws amp delete-workspace --workspace-id $WORKSPACE
eksctl delete cluster $AMP_EKS_CLUSTER

Conclusion

This blog demonstrated how to utilize Prometheus Adapter to autoscale deployments based on custom metrics. For the sake of simplicity, we fetched only one metric from Amazon Managed Service for Prometheus. However, the adapter ConfigMap can be extended to fetch any of the available metrics and utilize them for autoscaling.

Further Reading

About the author


Vikram Venkataraman

Vikram Venkataraman is a Senior Technical Account Manager at Amazon Web Services and a container enthusiast. He helps organizations with best practices for running workloads on AWS. In his spare time, he loves to play with his two kids and follows cricket.