Field Notes provides hands-on technical guidance from AWS Solutions Architects, consultants, and technical account managers, based on their experiences in the field solving real-world business problems for customers.

In this case study, we share our experience delivering results for remote customer engagements during the COVID-19 outbreak in China in Q1 2020.

We reduced two months of expected COVID-19 delay to two weeks’ delay by deploying a distributed remote Agile/AWS Engagement Delivery Framework (AWS EDF). We share three major takeaways in this blog post:

  1. Invest in your distributed team. Remote delivery costs more and requires preparation and setup effort, especially during your first remote sprint.
  2. Respond to challenges and adapt your setup accordingly. Don’t just stick to a playbook.
  3. Be diligent with your sprint ceremonies. This helps eliminate any communication overhead that could derail your progress.

Situation

In August 2019, we engaged one of the largest joint venture commercial banks in the Greater China Region and began to modernize their Merchant Service System into a microservice-based system. We also began to enable the customer team to be self-sufficient on microservices, using the two-pizza Agile/EDF working model. Our consultants’ main job was to work out tech spike stories, share our findings, and teach customer development teams, while also coaching Agile/EDF at the sprint and program level to the whole project team.

The customer’s team members were distributed across four offices in three cities (Hangzhou, Chengdu, and Shenzhen). They were organized around three microservice teams that included customer PgM/PO/BA, UI, frontend, backend, SysTest, and UAT roles, supported by other customer teams specializing in areas such as a Spring Boot-like framework, PaaS, and the CI/CD pipeline. The backend developer and the program-level Product Owner (PO) led the overall development progress and process.

When COVID-19 broke out in China, it had a two-month impact on the engagement and introduced several constraints. For example, all team members were required to work from home (WFH), we had no access to the customer’s Dev/IT environment, we had no remote collaboration tools, and one consultant’s hometown was locked down. We had to find a solution to work around these constraints. Otherwise, we would have been forced to bring the engagement to a halt. Additionally, the customer tightened the guidance that the engagement delivery date could not be changed, and that no one was to return to the office until the government lifted its COVID-19 regulations.

Action

We decided to change our delivery model from onsite to remote. This changed how we engaged with the customer, how we collaborated internally, and how we augmented our team to ensure that our remote work was effective.

Methodology adaptation: deploying remote and distributed agile and EDF

Before COVID-19, we were using Agile/EDF for delivery and enablement. After COVID-19, we enhanced and evolved our existing sprint ceremonies, and added more interactions for remote and distributed teams. Refer to Table 1 for a quick reference.

  1. Daily morning standup: For whole-project members to discuss key tasks, progress, and blockers. Not for the backend-specific tech tasks. Started at 40 minutes at the beginning of the WFH period, and evolved to 15-20 minutes.
  2. Added one daily late afternoon standup: Only for the backend team, focusing on more details about the backend specific tasks, progress, blockers, and AWS consultants’ tech support progress. Started at 30 minutes at the beginning of WFH, and evolved to <15 minutes.
  3. Sprint day-6 for Sprint+1’s replenish/grooming: Started the Definition of Ready (DoR) work for Sprint+1’s requirements (about 60 minutes), enforced with a quick DoR check (about 15 minutes on day-10). DoR = Definition of Ready, the Agile term for the criteria a requirement must meet before the team commits to it in a sprint.
  4. Sprint day-10 for review/retrospective: This included: (a) a review of the sprint increment, presented in turn by each microservice team, and (b) a retrospective that included every backend developer and a proxy for each other functional role. Took around 1.5 hours.
  5. Sprint+1 day-1 for planning: This included: (a) Sprint+1 scope confirmation at the Acceptance Criteria level, with other supporting material; (b) a breakdown of each story into subtasks with planned deadlines, especially for key handover tasks, for example, backend API definition (day-2), frontend/backend completion (day-5), and ST completion (day-10); and (c) late on day-1, a check of the sprint planning results, with follow-up actions taken offline. Took around 1 hour.
  6. Regular remote AWS consultants’ internal sync-up: (a) Daily sync-up right after the morning standup < 15 minutes; (b) Biweekly AWS tech tasks replenish ~30 minutes; (c) Biweekly AWS tech task planning ~1 hour. All these were supplemented by ad hoc Instant Messaging group sync-ups.
  7. Remote weekly enablement meeting for the customer: 1-hour meeting on Fridays, one tech topic at a time, focusing on the most critical and short-term topics.
  8. AWS standby for unplanned asks: Used for code review and dev-related consulting. After the meeting, one developer documented the consulting key points.
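
The cadence above can be written down as a small schedule, which is a convenient way to see how much meeting time a remote sprint carries. The following is a minimal sketch, not the tooling used on the engagement. It assumes a 10-working-day sprint that starts on a Monday (so that Fridays fall on day-5 and day-10), uses the steady-state upper bounds from the list for durations, and leaves out the AWS-internal sync-ups because their sprint days are not stated. The names `Ceremony`, `agenda`, and `total_minutes` are illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple

SPRINT_DAYS = 10  # assumption: two-week sprint, working days only

@dataclass(frozen=True)
class Ceremony:
    name: str
    days: Tuple[int, ...]  # 1-based sprint days on which the ceremony runs
    minutes: int           # steady-state upper bound taken from the list above

EVERY_DAY = tuple(range(1, SPRINT_DAYS + 1))

CEREMONIES = [
    Ceremony("Morning standup (whole project)", EVERY_DAY, 20),
    Ceremony("Late-afternoon standup (backend)", EVERY_DAY, 15),
    Ceremony("Sprint planning", (1,), 60),
    Ceremony("Sprint+1 grooming (DoR start)", (6,), 60),
    Ceremony("Sprint+1 DoR check", (10,), 15),
    Ceremony("Review and retrospective", (10,), 90),
    # Assumption: the sprint starts on a Monday, so Fridays are day-5 and day-10.
    Ceremony("Weekly enablement meeting", (5, 10), 60),
]

def agenda(day: int) -> List[str]:
    """Return the names of the ceremonies scheduled on a given sprint day."""
    if not 1 <= day <= SPRINT_DAYS:
        raise ValueError(f"sprint day must be between 1 and {SPRINT_DAYS}")
    return [c.name for c in CEREMONIES if day in c.days]

def total_minutes() -> int:
    """Sum the scheduled ceremony time across one sprint."""
    return sum(c.minutes * len(c.days) for c in CEREMONIES)

if __name__ == "__main__":
    for day in range(1, SPRINT_DAYS + 1):
        print(f"day-{day}: {', '.join(agenda(day))}")
    print(f"total ceremony time per sprint: {total_minutes()} minutes")
```

Keeping the schedule as data makes it straightforward to adjust durations as the team settles in, for example when the morning standup shrinks from 40 minutes to 15-20 minutes.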

Engagement setup changes

  1. Remote Access: We used the customer’s own VPN tool, which took about one week to get up and running. It was slightly less efficient than onsite direct access.
  2. Collaboration Setup: ZXXM/ running on AWS was chosen for the team’s collaboration platform. We also used Wiki/Jira as the Agile PM tool and to document key project information.
  3. Rotating the customer’s proxy Project Manager (PjM) role to assist the Project Manager: With rotating proxy PjMs sharing the load, the customer PjM could focus more on responsibilities as chief BA for the project and on dependencies on external teams.
  4. Rapid response for development blockers: No task should be self-studied for more than 1 hour. Once a customer developer had been blocked on a development task for around 1 hour, they consulted the AWS consultants. We used instant messaging to raise requests, and the responses were recorded in Jira/Wiki.
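
The one-hour rule in the last item is simple enough to express as a check. The sketch below is illustrative only and was not part of the engagement tooling. It assumes a developer records when they became blocked, treats exactly one hour as the trigger, and formats the text of an instant-messaging request. The function names and the task ID format are hypothetical.

```python
from datetime import datetime, timedelta

# From the rule above: no task should be self-studied for more than 1 hour.
SELF_STUDY_LIMIT = timedelta(hours=1)

def should_escalate(blocked_since: datetime, now: datetime,
                    limit: timedelta = SELF_STUDY_LIMIT) -> bool:
    """Return True once the developer has been blocked for at least `limit`."""
    return now - blocked_since >= limit

def escalation_message(task_id: str, developer: str,
                       blocked_since: datetime, now: datetime) -> str:
    """Build the text of an instant-messaging request (hypothetical format)."""
    minutes = int((now - blocked_since).total_seconds() // 60)
    return (f"[{task_id}] {developer} blocked for {minutes} min; "
            "requesting AWS consultant support.")

if __name__ == "__main__":
    start = datetime(2020, 3, 2, 9, 0)
    now = start + timedelta(minutes=75)
    if should_escalate(start, now):
        print(escalation_message("TASK-1", "dev-a", start, now))
```

Passing `now` explicitly rather than reading the clock inside the function keeps the rule easy to test and lets a team tune `limit` if one hour proves too tight or too loose.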

AWS team changes

  1. Added one consultant: Because remote communication and access required more effort, and because the chief tech consultant’s home city was locked down, we augmented the team with one additional consultant (80% time) to increase AWS capacity.
  2. More focused consultant dedication: One consultant for each microservice team, with one chief tech consultant covering cross-team items.

Result

The velocity of the first remote sprint was half that of a normal sprint before COVID-19. However, by the end of the third sprint, we had recovered completely and reduced the two months of expected COVID-19 impact to a single sprint (two weeks) when compared to our pre-COVID-19 schedule.

If you have questions or feedback, contact AWS Professional Services.

Table 1: Enhanced sprint ceremonies for a remote sprint

Newly added remote ceremonies are in bold and underlined.
