By Prasad Rao, Partner Solution Architect at AWS
By Nikolay Bunev, Sr. Cloud Consultant at HeleCloud

Running workloads on the cloud can help customers reduce total cost of ownership (TCO) while increasing agility and scalability. When transforming their operations for the cloud, it’s imperative that organizations understand cloud security controls and enable them correctly to protect and secure their data.

Within the AWS Shared Responsibility Model, Amazon Web Services (AWS) customers are responsible for securing their workloads within the application layer.

To help with this, AWS provides a variety of tools and features that organizations can leverage when delivering application-level security measures. For example, AWS Identity and Access Management (IAM) manages secure access to AWS services and resources.

However, there are scenarios where instead of relying on IAM for managing access to AWS workloads, it makes sense to rely on conventional passwords or secrets. One such scenario is embedding credentials in an application that accesses a database.

These types of credentials must be rotated regularly to meet internal security mandates and external compliance requirements. While customers could rotate secrets manually, manual processes are error-prone and can bring down applications.

AWS Secrets Manager makes it easier to manage those secrets. It’s a fully managed service that enables customers to rotate secrets safely and automatically. In fact, an automatic secret rotation feature is supported in a number of AWS databases.

HeleCloud combines AWS Secrets Manager and the AWS Systems Manager Run Command into a solution that automatically rotates secrets for databases running on Amazon Elastic Compute Cloud (Amazon EC2). In addition to automatically rotating your secrets, it allows you to access them in applications running on Amazon Elastic Kubernetes Service (Amazon EKS).

In this post, we will explore the HeleCloud solution and walk through the code snippets and steps required to set up automatic credentials rotation of Microsoft (MS) SQL Server running on Amazon EC2.

HeleCloud is an AWS Premier Consulting Partner with AWS Competencies in Security, DevOps, and Financial Services, and a member of the AWS Well-Architected Partner Program.

Customer Scenario

While migrating a customer’s IT infrastructure to AWS, HeleCloud had to migrate a number of MS SQL database workloads from conventional data centers to AWS. These MS SQL databases were accessed from a Python application hosted on Amazon EKS.

The customer needed the AWS solution to have these capabilities:

  • Store the secret securely.
  • Do not hardcode the secret in application configuration files or elsewhere.
  • Rotate the secret on a 90-day basis to comply with internal security policies.
  • Minimize application downtime during the rotation.

The AWS Secrets Manager native solution caters to the databases that are already supported. It can be extended with a multi-user approach that eliminates the downtime entirely by creating a second database user and alternating rotations.

However, as the customer’s SQL database was installed on EC2 and they were using Windows authentication instead of SQL Server authentication to manage their instances, we designed a custom solution to meet the requirements.

Solution Architecture

Figure 1 illustrates HeleCloud’s solution workflow to store and automatically rotate the secret key.

Figure 1 − Solution workflow.

The following numbers correspond to the numbers in Figure 1:

  1. MS SQL Server on EC2 gets the current secret (AWSCURRENT) during the instance bootstrapping.
  2. The application deployed in EKS accesses the AWSCURRENT secret state.
  3. AWS Secrets Manager, per the rotation policy, initiates the secret rotation by invoking an AWS Lambda function.
  4. The Lambda function creates the new secret in AWSPENDING state in AWS Secrets Manager. It is encrypted with a custom KMS key.
  5. The Lambda function initiates the AWS Systems Manager Run Command script.
  6. The Run Command script updates the SQL password with the newly-created secret in AWSPENDING state.
  7. The Run Command script tests whether the update is successful and returns an exit code to AWS Systems Manager.
  8. The Lambda function parses the exit code from AWS Systems Manager and, if successful, moves the secret state from AWSPENDING to AWSCURRENT.

As the application deployed in Amazon EKS always accesses the AWSCURRENT secret state from AWS Secrets Manager, it begins using the new secret after the rotation.
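The final promotion in step 8 can be sketched in Python as follows. This is a minimal illustration rather than HeleCloud's actual function: the name finish_secret is hypothetical, and client is assumed to be a boto3 Secrets Manager client that is passed in so the logic can be tried against a stub.

```python
def finish_secret(client, secret_id, token):
    """Move the AWSCURRENT label to the version identified by `token`.

    `client` is assumed to be boto3.client("secretsmanager") or a stub with
    the same describe_secret / update_secret_version_stage methods.
    Returns True if the label was moved, False if it was already in place.
    """
    metadata = client.describe_secret(SecretId=secret_id)

    current_version = None
    for version_id, stages in metadata["VersionIdsToStages"].items():
        if "AWSCURRENT" in stages:
            if version_id == token:
                # A retried invocation: the new version is already current.
                return False
            current_version = version_id
            break

    request = {
        "SecretId": secret_id,
        "VersionStage": "AWSCURRENT",
        "MoveToVersionId": token,
    }
    if current_version is not None:
        # The label must be removed from the version that currently holds it.
        request["RemoveFromVersionId"] = current_version

    client.update_secret_version_stage(**request)
    return True
```

Because the label only moves after the database update has been verified, readers of AWSCURRENT should not see a password the database has not yet accepted.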

How We Deployed the HeleCloud Solution Architecture

Let’s deep dive into how we deployed the HeleCloud architecture. We’ll cover these steps:

  1. Creating the secret.
  2. Securing access to the secret.
  3. Accessing the secret.
  4. Rotating the secret.
  5. Automating secret rotation.

Step 1: Creating the Secret in AWS Secrets Manager with Rotation Policy

To create the secret that holds the database credentials, we used this Terraform code snippet:

resource "aws_secretsmanager_secret" "db-secret-rotation" {
  name        = "db-secret-app01"
  description = "This is an example secret"
  kms_key_id  = aws_kms_key.kms_key_db_secret.arn
  policy      = data.template_file.db_secret_policy.rendered
  rotation_lambda_arn = aws_lambda_function.db_autorotation.arn
  rotation_rules {
    automatically_after_days = 85
  }
}

This code creates the secret, assigns a resource policy, and sets automatic secret rotation to 85 days. AWS recommends setting the rotation interval shorter than the maximum number of days the requirement allows.

In this specific scenario, as the requirement was to rotate the secret every 90 days, we kept the value as 85 days. Even though the secret rotation is part of the AWS Secrets Manager API, it relies on the external Lambda function to call the API.

Therefore, we also pointed to a predefined rotation Amazon Resource Name (ARN) for the Lambda function.
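For readers who script this step with boto3 instead of Terraform, the same settings could be expressed roughly as below. This is a sketch under assumptions: the helper rotation_request and its margin_days parameter are hypothetical names introduced here to show how 85 is derived from the 90-day requirement.

```python
def rotation_request(secret_id, lambda_arn, required_days=90, margin_days=5):
    """Build keyword arguments for the Secrets Manager rotate_secret call.

    The interval is the compliance requirement minus a safety margin, so a
    delayed or retried rotation still completes inside the required window.
    """
    if margin_days < 0 or margin_days >= required_days:
        raise ValueError("margin_days must be >= 0 and smaller than required_days")
    return {
        "SecretId": secret_id,
        "RotationLambdaARN": lambda_arn,
        "RotationRules": {"AutomaticallyAfterDays": required_days - margin_days},
    }
```

The result could then be passed as boto3.client("secretsmanager").rotate_secret(**rotation_request(...)). Note that, unless told otherwise, that call may also start a rotation right away.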

Step 2: Securing Access to the Secret

AWS defines a secret as a resource in AWS Secrets Manager. Access to secrets in AWS Secrets Manager is controlled through secret resource-based policies. You can manage resource-based policies for AWS Secrets Manager with either the AWS Command Line Interface (AWS CLI) or an AWS Software Development Kit (SDK).

This is the resource policy we used to implement our architecture. We encrypted the secret with a custom AWS Key Management Service (KMS) key, so we defined a resource policy to control the access to the key.
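To make the shape of such a policy concrete, here is an illustrative sketch that builds a resource policy document in Python. It is not the policy HeleCloud used: the function name build_secret_policy and the role ARNs are assumptions, and the statement simply allows the listed roles to read the secret value.

```python
import json


def build_secret_policy(reader_role_arns):
    """Return a JSON resource policy letting the given IAM roles read the secret.

    The role ARNs are placeholders supplied by the caller. In a policy attached
    to a secret, "Resource": "*" refers to that secret itself.
    """
    if not reader_role_arns:
        raise ValueError("at least one reader role ARN is required")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": sorted(reader_role_arns)},
                "Action": "secretsmanager:GetSecretValue",
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

The resulting string could be attached with the put_resource_policy call of the boto3 Secrets Manager client, or rendered into the Terraform policy argument.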

Step 3: Accessing the Secret from EC2 and EKS

Next, we needed to define the IAM roles and policies for the end users and the services that require access to the secret. In our scenario, the services were Amazon EC2 (with MS SQL installed) and Amazon EKS. This is their IAM role.

To access the secret from EC2, we ran a script built on the AWS Tools for PowerShell cmdlets in the Amazon EC2 user data, and then updated the database credentials when the MS SQL instance initialized.

Accessing the secret from EKS requires an external tool to fetch the secret from AWS Secrets Manager. The Kubernetes External Secrets (KES) tool allows us to use external secret management systems like AWS Secrets Manager to securely add secrets to Kubernetes.

The application running on EKS uses the ExternalSecret object supplied by KES. It polls the AWS Secrets Manager and uses the fetched value to connect to the database running on the MS SQL instance. The poll environment variable POLLER_INTERVAL_MILLISECONDS is configurable and is by default set to 10 seconds. It can be adjusted based on the application requirement.
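In this solution the application receives the value through KES rather than calling AWS Secrets Manager itself. Purely to illustrate what reading the AWSCURRENT stage involves, a direct lookup with boto3 might look like the sketch below. The function name and the assumption that the secret is stored as JSON with username and password keys are ours, not taken from the HeleCloud code.

```python
import json


def fetch_db_credentials(client, secret_id):
    """Read the AWSCURRENT version of a secret and return (username, password).

    `client` is assumed to be boto3.client("secretsmanager") or a compatible
    stub. The JSON layout with "username" and "password" keys is an assumption.
    """
    response = client.get_secret_value(SecretId=secret_id, VersionStage="AWSCURRENT")
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]
```

Asking for the stage label rather than a fixed version ID is what lets a consumer pick up the rotated password without any configuration change.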

Step 4: Rotating the Secret with AWS Lambda

AWS Secrets Manager uses an AWS Lambda function to rotate the secret. If we use our secret for one of the databases supported by Amazon Relational Database Service (Amazon RDS), then AWS Secrets Manager provides the Lambda function for us.

If we use our secret for another service, we must provide the code for the Lambda function. The functionality built into the Lambda function for rotation breaks down into distinct steps. The Python template is available in the aws-samples repository.
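As a rough sketch of how those steps are usually wired together, the rotation event carries a Step, a SecretId, and a ClientRequestToken, and the function dispatches on the step name. The dispatcher below is a simplified illustration with a hypothetical name (dispatch_rotation_step); the handlers mapping is supplied by the caller.

```python
ROTATION_STEPS = ("createSecret", "setSecret", "testSecret", "finishSecret")


def dispatch_rotation_step(event, handlers):
    """Route a Secrets Manager rotation event to the handler for its step.

    `handlers` maps each step name to a callable taking (secret_id, token).
    Secrets Manager invokes the function once per step, in the order listed
    in ROTATION_STEPS.
    """
    step = event["Step"]
    if step not in ROTATION_STEPS:
        raise ValueError(f"Invalid rotation step: {step!r}")
    return handlers[step](event["SecretId"], event["ClientRequestToken"])
```

In the architecture described here, setSecret is where the Run Command call would sit, and finishSecret is where the AWSCURRENT label moves.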

We also created an IAM role with a policy that grants the permissions to the Lambda function so it can access and rotate the secret in AWS Secrets Manager and the other services we use to update the passwords across the SQL instances. These services include AWS Systems Manager Run Command and Amazon Simple Notification Service (SNS).

Step 5: Automating Secret Rotation

The steps of secret rotation are well documented. The challenge in our scenario was to automatically update the password of the database instance when a new secret is generated. There are three main ways to authenticate on Microsoft SQL Server:

  • Connection through Windows authentication:
    • By using local account.
    • By using domain accounts.
  • Connection through SQL Server authentication.
  • Mixed mode authentication.

To change the password of a particular USER role, we needed admin privileges. To get them, we could either authenticate as the SQL system administrator through a SQL command, or we could rely on Windows authentication by using a local account. To do so, we needed to grant admin privileges to that account by making it a member of the SQL sysadmin database server role.

Because we needed to change an SQL LOGIN password remotely by using a local Windows account, we decided to use the AWS Systems Manager Run Command. It lets us remotely and securely manage the configuration of the managed instances. Run Command also enables us to automate common administrative tasks and perform ad hoc configuration changes at scale.

If you want to use the AWS Systems Manager Run Command, keep in mind these two prerequisites:

If you want to combine Windows authentication for SQL Server with the AWS Systems Manager Run Command, be sure ssm-user is a member of the sysadmin role.

As mentioned previously, AWS Secrets Manager relies on an AWS Lambda function to change the secret. When a secret rotation has been initiated, the Lambda function uses the AWS Secrets Manager API to create a new secret in AWSPENDING state.
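A minimal sketch of that first step is shown below, assuming a boto3 Secrets Manager client and a JSON secret with username and password keys. The function name, the password length, and the excluded characters are illustrative choices, not values from the HeleCloud code.

```python
import json


def create_secret(client, secret_id, token, username):
    """Create the AWSPENDING version for this rotation if it does not exist yet.

    Returns True when a new version was stored, False when a previous
    (retried) invocation had already created it.
    """
    try:
        client.get_secret_value(
            SecretId=secret_id, VersionId=token, VersionStage="AWSPENDING"
        )
        return False  # idempotent: the pending version is already there
    except client.exceptions.ResourceNotFoundException:
        pass

    # Let Secrets Manager generate the password; the excluded characters are
    # an assumption meant to avoid quoting problems in scripts.
    generated = client.get_random_password(
        PasswordLength=32, ExcludeCharacters="\"'@/\\"
    )
    client.put_secret_value(
        SecretId=secret_id,
        ClientRequestToken=token,
        SecretString=json.dumps(
            {"username": username, "password": generated["RandomPassword"]}
        ),
        VersionStages=["AWSPENDING"],
    )
    return True
```

Checking for an existing pending version first keeps the step safe to retry, which matters because Secrets Manager may invoke the function more than once.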

Our Lambda function for rotation then uses the AWS Systems Manager Run Command, which in boto3 is named send_command(), to execute a PowerShell script on the Amazon EC2 instance by using the name tag as a filter. We also pass environment variables as input parameters for the script.

ssm_run_command = ssm_client.send_command(
    Targets=[{'Key': "tag:Name", 'Values': [db_instance_name]}],
    DocumentName='AWS-RunPowerShellScript',
    Comment=f'Change database password update on server {db_instance_name}.',
    Parameters={'commands': [
        f'$env:DBInstanceName = "{db_instance_name}"',
        f'$env:SecretId = "{arn}"',
        f'$env:SecretName = "{secret_name}"',
        f'$env:LoginName = "{login_name}"',
        f'Start-Process powershell "c:/SQLInstall/rotate_db_password.ps1" -NoNewWindow'
    ]},
    CloudWatchOutputConfig={
        'CloudWatchLogGroupName': f'/sqlserver/secretsmanager',
        'CloudWatchOutputEnabled': True
    }
)

As shown in the following PowerShell code, we take the AWSPENDING value from AWS Secrets Manager, update the password in the database and, if the password is successfully updated, we set a success flag (PASSWORDUPDATESUCCESSFUL).

# Fetch the pending password created by the rotation Lambda function
$password = (Get-SECSecretValue -SecretId $env:SecretId -Select SecretString -VersionStage "AWSPENDING")
$srv = New-Object "Microsoft.SqlServer.Management.Smo.Server" $ServerName
if (!$password)
{
    Throw "Exiting. We were not able to get the password from AWS Secrets Manager"
}
else
{
    if ($srv.Logins.Contains($loginName)) {
        # The login exists: change its password
        Write-Host -Object "User '$loginName' found. Changing password..."
        $SQLUser = $srv.Logins | where {$_.Name -eq "$loginName"}
        $SQLUser.ChangePassword($password);
        $SQLUser.Alter();
        $SQLUser.Refresh();
    } else {
        # The login does not exist yet: create it with the new password
        Write-Host -Object "Create new sql user '$loginName'"
        $login = New-Object -TypeName Microsoft.SqlServer.Management.Smo.Login -ArgumentList $ServerName, $loginName
        $login.LoginType = [Microsoft.SqlServer.Management.Smo.LoginType]::SqlLogin
        $login.Create($password)
    }
    # Verify that the new password works before reporting success
    Invoke-Sqlcmd -ServerInstance $ServerName -Username $loginName -Password $password -Query "select @@version" | Out-Null
    Write-Host -Object "PASSWORDUPDATESUCCESSFUL"
}

We later use this flag to check the outcome of the PowerShell script by parsing the log output for the flag the script wrote while it ran: PASSWORDUPDATESUCCESSFUL

if re.search(r'\bPASSWORDUPDATESUCCESSFUL\b', ssm_run_command_stdout):
    logger.info(
        "setSecret: Secret rotation RunPowerShellScript has finished successfully on server %s" % db_instance_name)
    publish_sns(sns_topic_arn, "db_rotate_secret: Lambda: RunPowerShellScript has finished successfully",
                "Secret rotation RunPowerShellScript has finished successfully on server %s" % db_instance_name)
else:
    ssm_run_command_stderr = ssm_run_command_response['StandardErrorContent']
    publish_sns(sns_topic_arn, "db_rotate_secret: Lambda: Secret rotation has failed",
                "setSecret: RunPowerShellScript has failed with %s." % ssm_run_command_stderr)
    # Drop the AWSPENDING label so the failed version is not promoted
    service_client.update_secret_version_stage(SecretId=arn, VersionStage="AWSPENDING",
                                               RemoveFromVersionId=token)
    raise SystemExit("setSecret: RunPowerShellScript has failed with %s.." % ssm_run_command_stderr)

Because AWS Systems Manager APIs are eventually consistent, we need a built-in retry mechanism before we can parse the Run Command response. Depending on the language you choose for your Lambda function, you can do that natively. Or, if using Python, you can use the boto3 waiters.
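One way to write such a retry loop by hand is sketched below. It assumes a boto3 SSM client and that the target instance ID is already known (when targeting by tag, it would first have to be looked up). The function name, attempt count, and delay are illustrative.

```python
import time

# Statuses after which the invocation will not change any more.
TERMINAL_STATUSES = {"Success", "Failed", "Cancelled", "TimedOut"}


def wait_for_command(ssm_client, command_id, instance_id,
                     attempts=20, delay_seconds=3, sleep=time.sleep):
    """Poll get_command_invocation until the command reaches a terminal status.

    Returns the final response, which includes Status, StandardOutputContent
    and StandardErrorContent. Raises TimeoutError if no terminal status is
    seen within `attempts` polls.
    """
    for _ in range(attempts):
        try:
            response = ssm_client.get_command_invocation(
                CommandId=command_id, InstanceId=instance_id
            )
        except ssm_client.exceptions.InvocationDoesNotExist:
            # Eventual consistency: the invocation is not visible yet.
            response = None
        if response is not None and response["Status"] in TERMINAL_STATUSES:
            return response
        sleep(delay_seconds)
    raise TimeoutError(f"Command {command_id} did not finish in time")
```

With boto3, ssm_client.get_waiter("command_executed") offers a built-in alternative to this loop. In either case, the total wait should stay below the Lambda function's timeout.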

Conclusion

HeleCloud combines AWS Secrets Manager, the AWS Systems Manager Run Command, and Kubernetes External Secrets to enable automatic rotation of database credentials. The HeleCloud solution also automatically updates the passwords on the database instance itself.

Automating credential rotation reduces the risk of unintended credential exposure or reuse, making cloud assets more secure.

The entire rotation takes around one minute. Since Kubernetes External Secrets polls are set to 10 seconds, the credentials are automatically updated in the application with very limited interruption to connectivity, especially with the retry mechanism built into the application.

All HeleCloud code snippets and policies in this solution are available in GitHub.

We are interested in hearing how you secure your workloads in the cloud, as well as the challenges you face and the solutions that work for you.

HeleCloud – AWS Partner Spotlight

HeleCloud is an AWS Premier Consulting Partner that provides strategic technology consultancy, engineering, and cloud-based managed services.
