Amazon Redshift is the most widely used cloud data warehouse. It allows you to run complex analytic queries against terabytes to petabytes of structured and semi-structured data, using sophisticated query optimization, columnar on high-performance storage, and massively parallel query execution.
At AWS, we embrace the culture that security is job zero, by which we mean it’s even more important than any number one priority. AWS provides comprehensive security capabilities to satisfy the most demanding requirements, and Amazon Redshift provides data security out of the box at no extra cost. Amazon Redshift uses the AWS security frameworks to implement industry-leading security in the areas of authentication, access control, auditing, logging, compliance, data protection, and network security.
Cybersecurity Awareness Month raises awareness about the importance of cybersecurity, ensuring everyone has the resources they need to be safer and more secure online. This post highlights some of the key out-of-the-box capabilities available in Amazon Redshift to manage your data securely.
Amazon Redshift supports industry-leading security with built-in AWS Identity and Access Management (IAM) integration, identity federation for single sign-on (SSO), and multi-factor authentication. You can federate database user authentication easily with IAM and Amazon Redshift using IAM and a third-party SAML-2.0 identity provider (IdP), such as AD FS, PingFederate, or Okta.
To get started, see the following posts:
- Federate Amazon Redshift access with Okta as an identity provider
- Federate Amazon Redshift access with Microsoft Azure AD single sign-on
- Federating single sign-on access to your Amazon Redshift cluster with PingIdentity
- Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 1 and Part 2
- Amazon Redshift identity federation with multi-factor authentication
- Federated authentication to Amazon Redshift using AWS Single Sign-On .
Granular row and column-level security controls ensure users see only the data they should have access to. You can achieve column-level access control for data in Amazon Redshift by using the column-level grant and revoke statements without having to implement views-based access control or use another system. Amazon Redshift is also integrated with AWS Lake Formation, which makes sure that column and row level access in Lake Formation is enforced for Amazon Redshift queries on the data in the data lake.
The following guides can help you implement fine-grained access control on Amazon Redshift:
- Using Redshift Spectrum with AWS Lake Formation
- Restrict Amazon Redshift Spectrum external table access to Amazon Redshift IAM users and groups using role chaining
- Achieve finer-grained data security with column-level access control in Amazon Redshift
- Column Level Access Control in Amazon Redshift
Auditing and logging
To monitor Amazon Redshift for any suspicious activities, you can take advantage of the auditing and logging features. Amazon Redshift logs information about connections and user activities, which can be uploaded to Amazon Simple Storage Service (Amazon S3) if you enable the audit logging feature. The API calls to Amazon Redshift are logged to AWS CloudTrail, and you can create a log trail by configuring CloudTrail to upload to Amazon S3. For more details, see Database audit logging, Analyze logs using Amazon Redshift spectrum, Querying AWS CloudTrail Logs, and System object persistence utility.
Amazon Redshift is assessed by third-party auditors for compliance with multiple programs. If your use of Amazon Redshift is subject to compliance with standards like HIPAA, PCI, or FedRAMP, you can find more details at Compliance validation for Amazon Redshift.
To protect data both at rest and while in transit, Amazon Redshift provides options to encrypt the data. Although the encryption settings are optional, we highly recommend enabling them. When you enable encryption at rest for your cluster, it encrypts both the data blocks as well as the metadata, and there are multiple ways to manage the encryption key (see Amazon Redshift database encryption). To safeguard your data while it’s in transit from your SQL clients to the Amazon Redshift cluster, we highly recommend configuring the security options as described in Configuring security options for connections.
For additional data protection options, see the following resources:
- Client-side encryption
- Internetwork traffic privacy
- Data tokenization
- Encrypt Your Amazon Redshift Loads with Amazon S3 and AWS KMS
Amazon Redshift enables you to use an AWS Lambda function as a UDF in Amazon Redshift. You can write Lambda UDFs to enable external tokenization of data dynamic data masking, as illustrated in Amazon Redshift – Dynamic Data Masking.
Amazon Redshift is a service that runs within your VPC. There are multiple configurations to ensure access to your Amazon Redshift cluster is secured, whether the connection is from an application within your VPC or an on-premises system. For more information, see VPCs and subnets. Amazon Redshift for AWS PrivateLink ensures that all API calls from your VPC to Amazon Redshift stay within the AWS network. For more information, see Connecting to Amazon Redshift using an interface VPC endpoint.
Customer success stories
You can run your security-demanding analytical workload using out-of-the-box features. For example, SoePay, a Hong Kong–based payments solutions provider, uses AWS Fargate and Amazon Elastic Container Service (Amazon ECS) to scale its infrastructure, AWS Key Management Service (AWS KMS) to manage cryptographic keys, and Amazon Redshift to store data from merchants’ smart devices.
With AWS services, GE Renewable Energy has created a data lake where it collects and analyses machine data captured at GE wind turbines around the world. GE relies on Amazon S3 to store and protect its ever-expanding collection of wind turbine data and Amazon Redshift to help them gain new insights from the data it collects.
For more customer stories, see Amazon Redshift customers.
Conclusion and Next Steps
In this post, we discussed some of the key out-of-the-box capabilities at no extra cost available in Amazon Redshift to manage your data securely, such as authentication, access control, auditing, logging, compliance, data protection, and network security.
You should periodically review your AWS workloads to ensure security best practices have been implemented. The AWS Well-Architected Framework helps you understand the pros and cons of decisions you make while building systems on AWS. This framework can help you learn architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud. Review your security pillar provided in this framework.
In addition, AWS Security Hub, an AWS service, provides a comprehensive view of your security state within AWS that helps you check your compliance with security industry standards and best practices.
To adhere to the security needs of your organization, you can automate the deployment of an Amazon Redshift cluster in an AWS account using AWS CloudFormation and AWS Service Catalog. For more information, see Automate Amazon Redshift cluster creation using AWS CloudFormation and Automate Amazon Redshift Cluster management operations using AWS CloudFormation.
About the Authors
Kunal Deep Singh is a Software Development Manager at Amazon Web Services (AWS) and leads development of security features for Amazon Redshift. Prior to AWS he has worked at Amazon Ads and Microsoft Azure. He is passionate about building customer solutions for cloud, data and security.
Thiyagarajan Arumugam is a Principal Solutions Architect at Amazon Web Services and designs customer architectures to process data at scale. Prior to AWS, he built data warehouse solutions at Amazon.com. In his free time, he enjoys all outdoor sports and practices the Indian classical drum mridangam.