Behind every successful open source project, you’ll find a real problem that needed to be solved. In this post, I will explore one such example through the backstory of the AWS Cloud Development Kit, or AWS CDK for short.
A big part of this story involves the impact of the Amazon culture and our approach to open source. At Amazon, we have a set of 14 Leadership Principles (LPs) that guide our daily interactions. We create and use mechanisms that provide complete processes and tenets to guide a team’s decision making. Working backwards from the real challenges that AWS CDK addressed, I will share how the project incorporated these principles to set itself up for success and create an open source culture. I’ll also discuss how open source projects can help drive innovation and new ways of thinking.
AWS Cloud Development Kit
If you are not familiar with the AWS CDK, it is an open source software development framework that lets developers define cloud application resources using familiar programming languages, so they do not have to learn a new domain-specific language. The AWS CDK lets developers manage those resources in the same way that they manage their application code, ensuring visibility, reproducibility, and repeatability, and it is part of how modern applications treat their infrastructure as code.
If you are interested in taking AWS CDK for a spin, I highly recommend the CDK Workshop, which is a fun tutorial that will walk you through first steps with the AWS CDK.
So, what was the catalyst for the AWS CDK?
A few years ago, we were re-architecting our search service to better identify hot trending products. This new event-driven architecture was built using AWS services (for example, Amazon DynamoDB, AWS Lambda, and Amazon Kinesis) deployed in multiple environments across the AWS global infrastructure. The team wanted to architect and build this in a modular way so they could develop and test in isolation and evolve independently, if necessary.
AWS CloudFormation was used to reliably and consistently provision the resources they needed, but the team discovered an unmet need. Although AWS CloudFormation was the right tool for provisioning resources, the team felt that using YAML/JSON was not the right approach for describing their system. AWS CloudFormation templates are basically a flat list of resources and their configuration. They don’t include tools for expressing abstract ideas such as “the injection pipeline” or the “storage layer” or a “dynamodb scanner.”
Rather than limit themselves, the team followed Amazon’s Invent and Simplify principle and came up with a high-level, object-oriented abstraction that allowed them to work with the power of AWS CloudFormation but use Java instead of JSON. The team then created classes that modeled individual AWS resources, and, using a programming model called “constructs,” they could quickly assemble these into reusable high-level concepts. Additionally, this approach made it possible to use the same programming language for the application and the infrastructure, thereby reducing the learning curve for developers and allowing them to use techniques like unit tests to improve quality. The project was delivered ahead of schedule and served as a proof of concept for the approach that led to the creation of the AWS CDK.
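To make the idea of constructs concrete, here is a minimal, dependency-free sketch in Python. It is not the AWS CDK API and not the team's original Java library; the class names (Construct, Resource, StorageLayer), the synthesize function, and the resource properties are all assumptions chosen for illustration. It shows classes that model individual resources, a higher-level "storage layer" assembled from them, and a step that flattens the tree into a CloudFormation-style template.

```python
import json


class Construct:
    """A node in a tree; higher-level concepts are built by nesting constructs."""

    def __init__(self, scope, name):
        self.name = name
        self.children = []
        if scope is not None:
            scope.children.append(self)

    def resources(self, prefix=""):
        """Yield (logical_id, definition) for every resource below this node."""
        for child in self.children:
            yield from child.resources(prefix + self.name)


class Resource(Construct):
    """Leaf construct that models a single CloudFormation-style resource."""

    def __init__(self, scope, name, type_, properties=None):
        super().__init__(scope, name)
        self.type_ = type_
        self.properties = properties or {}

    def resources(self, prefix=""):
        # The logical ID is derived from the path through the tree.
        yield prefix + self.name, {"Type": self.type_, "Properties": self.properties}


class StorageLayer(Construct):
    """Hypothetical reusable concept: a table plus a stream, created together."""

    def __init__(self, scope, name):
        super().__init__(scope, name)
        Resource(self, "Table", "AWS::DynamoDB::Table", {
            "BillingMode": "PAY_PER_REQUEST",
            "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
            "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        })
        Resource(self, "Stream", "AWS::Kinesis::Stream", {"ShardCount": 1})


def synthesize(root):
    """Flatten the construct tree into a flat, template-shaped dictionary."""
    return {"Resources": dict(root.resources())}


app = Construct(None, "Search")
StorageLayer(app, "Storage")
template = synthesize(app)
print(json.dumps(template, indent=2))
```

Because the synthesized template is an ordinary dictionary, a unit test can assert on it directly, for example by checking that the storage layer always produces exactly one table and one stream.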
At Amazon, we use the PRFAQ mechanism to “work backwards” from customer needs when we create new products and services. As is often the case, we found that AWS customers have similar challenges and needs, which led to this internal library becoming a new AWS product.
Customers could relate to the pain points that AWS CDK was looking to solve, which helped drive initial internal adoption and usage. At first, this growth was organic, and it extended when the product was released externally to customers, first, as a private beta and, soon after, as a public open source project.
The growth we are seeing is not limited to using AWS CDK, but also includes contributing AWS CDK constructs and collections of patterns that can be used to accelerate your solutions.
Setting up for success
Because the AWS CDK was designed as an open source project from day one, another tenet was to fully distribute the team to ensure no differences between external contributors and internal ones. From the beginning, the AWS CDK team has had people across the world, with the first hire in the Netherlands while the project lead was in the middle of moving between San Francisco and Tel Aviv. Subsequent team members from other countries helped maintain this distributed culture, and, indeed, the AWS CDK has hundreds of contributors per month, both external to Amazon and from all across the company.
Additionally, it was clear from the start that we wanted contributions to be easy to make, and the approach we took was to streamline that process within the project’s workflows. That meant understanding the tools that external contributors would be using (for example, GitHub) and following the standard open source development approach (for example, starting with creating issues, raising pull requests, etc.). These forcing functions meant that the team dealt with external contributions to the project in the same way as contributions by team members or any AWS employee.
Creating a culture that cultivates contribution
At Amazon, we have a strong ownership culture embodied in another of our LPs: Ownership. We have a natural tension between that strong ownership culture and the collaborative nature of open source development. This tension is normal, however, as it provides a mechanism by which we can balance and improve conflicting priorities over time.
One of the ways we addressed the tension was by getting internal contributors to work in the open and raise issues and pull requests, which has really increased the number of internal engineers who want to contribute to external projects. This also meant reducing friction points to make contributions easier. This goal, in turn, led to another tension point and another Amazon LP: Insist on the Highest Standards. Making contributions easier must be balanced by maintaining and raising the bar on quality and contributions.
Contributions and the community grew rapidly as AWS CDK came to be seen as a “must-have app.” We also saw a rapid rise in the activity of the internal Slack channel, which has more than 4,500 members, along with new developers contributing for the first time. The AWS CDK team still works hard to make sure that the contribution experience is great. We have invested in the build systems and development environments and ensured that the documentation is clear, so you can get started quickly. We have also worked hard on being responsive when issues and pull requests are raised. Being responsive turned out to be a key factor in making the contribution experience sticky and ensuring developers return.
When you get all of these things right, you generate a flywheel that accelerates your collaboration and contribution growth. It also allowed us to cultivate the internal culture needed to be successful externally.
Because we don’t know or understand everything, a key part of this journey involves the opportunities to learn and evolve how we engage with the community and customers around AWS CDK. I want to share a couple of examples.
One of the first areas we needed to understand involved setting expectations with the community about the maturity of the project. If you look at how open source projects are orchestrated, many have well-defined and well-understood semantics around the project life cycle: things like the quality levels of code produced, how to ensure production readiness, how a project graduates between stages, and more. This area continues to be something that we are learning from within the team and from other AWS teams who are developing open source projects.
Another area that we have continued to learn from is how to balance new features with innovation in a project that is becoming more and more stable over time. Initially, new features were released using experimental labels, but we didn’t want customers to trip over this later. This concern continues to drive how we think and change our approach within AWS CDK, so watch this space.
The final thing we realized was that it was critical to make the foundation on which these external libraries and third parties depend rock solid, and to make sure people can depend on that foundation.
The CDK community
During the keynote at the 2020 AWS Summit Online Americas, Werner Vogels, CTO of Amazon, shared stories of how customers are using and contributing to AWS CDK. We learned, for example, that Liberty Mutual has developed and deployed more than 1000 applications in non-production and more than 100 in production with AWS CDK. Liberty Mutual said that the benefits to them were saving time in deploying applications, ensuring the implementation of best practices, and driving developer collaboration with the sharing of reusable patterns. (Read the blog post “The CDK Patterns open source journey” to learn more.)
Forty-eight percent of the commits into the CDK codebase originate from external contributions. The project is active with customers and the community starting to create their own constructs as well as coming together to build open source repositories of reusable patterns that allow developers to quickly bootstrap projects.
What is also interesting is how the AWS CDK project is evolving and addressing other unmet needs. As Vogels put it in his keynote, “One day, I see that CDK will become the standard way of writing cloud applications.” When you think about the core idea behind AWS CDK, you are composing resources using code and then synthesizing those into configuration that the system can understand and use. This structure has allowed CDK8s (CDK for Kubernetes) and CDKtf (CDK for Terraform) to adopt the same approach and synthesize Kubernetes and HashiCorp’s Terraform configuration files.
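The compose-then-synthesize idea does not depend on the output format, which is why it carries over to other targets. As a rough sketch only, the Python below applies the same kind of tree to Kubernetes-style manifests. This is not the CDK8s API: the Construct, ApiObject, and WebService classes, the synthesize function, and the nginx image are hypothetical names picked for illustration, and the output is printed as JSON rather than YAML to stay within the standard library.

```python
import json


class Construct:
    """Shared tree node: the composition model is independent of the output format."""

    def __init__(self, scope, name):
        self.name = name
        self.children = []
        if scope is not None:
            scope.children.append(self)

    def walk(self):
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


class ApiObject(Construct):
    """Hypothetical leaf that models one Kubernetes object."""

    def __init__(self, scope, name, api_version, kind, spec):
        super().__init__(scope, name)
        self.manifest = {
            "apiVersion": api_version,
            "kind": kind,
            "metadata": {"name": name},
            "spec": spec,
        }


class WebService(Construct):
    """Reusable higher-level concept: a Deployment plus the Service in front of it."""

    def __init__(self, scope, name, image, port):
        super().__init__(scope, name)
        labels = {"app": name}
        ApiObject(self, f"{name}-deployment", "apps/v1", "Deployment", {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "web", "image": image}]},
            },
        })
        ApiObject(self, f"{name}-service", "v1", "Service", {
            "selector": labels,
            "ports": [{"port": port}],
        })


def synthesize(root):
    """Collect the manifests of all leaf objects under root."""
    return [node.manifest for node in root.walk() if isinstance(node, ApiObject)]


chart = Construct(None, "chart")
WebService(chart, "hello", image="nginx", port=80)
manifests = synthesize(chart)
print(json.dumps(manifests, indent=2))
```

Only the leaf type and the synthesize step know about the target format; the tree and the reusable WebService concept are composed the same way regardless of what configuration is eventually emitted.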
It’s also interesting to see how the community is driving new applications of the AWS CDK technology. At a recent CDK Day, for example, more than a thousand builders gathered from all over the world to share proofs of concept, such as how to use CDK to deploy Docker applications, a CDK that created Azure configuration files, and projects such as projen that focus on the toolchain developers use when starting projects.