Yogitha O

Written by Selvakumar K, Associate Tech Lead — DBA & Kavitha L, Software Developer at Powerupcloud Technologies.

It’s been a couple of weeks, me and my colleague were struggling to get the work together to put learning and solutions for help.

In the beginning, we have written scripts which fortunately worked for Dev and QA Dynamodb Environment but if we look for the real-time scenarios where there could be numerous (say 3 crores) records in the Dynamodb table, the solution would not work. After some days of research, we accomplish a solution using python.

Problem Statement

Retrieve the primary key from the Dynamodb table for the particular policy number and update the dependent items in the excel sheet.

Problems and Limitations in DynamoDB

  1. Dynamodb Read and Write capacity is limited to 20, so we have changed unlimited the provisioned capacity
  2. To perform an update in one shot it’s difficult in case of huge data size. (e.g Compare the policy number from an excel sheet with Dynamodb table). The BatchGetItem operation can retrieve a maximum of 100 items at a time. The total size of all the items retrieved cannot exceed 16 MB
  3. Batch wise update consumes more memory resources so we have increased instance type and updated the items

How does it work?

First, we would read the excel data and convert into the python dictionary.

QA.csv

In the above scenario, each ID has multiple policy information and we are fetching single policy ID from the excel sheet and storing in memory.

If we have more than one policy information we would need to separate and retrieve the policy ID for the update.

Before we begin a comparison of policy number with Dynamodb table, establish the connectivity with DynamoDB.

Comparing the policy number from excel and DynamoDB table to fetch the ID of the DynamoDB table.

Finally, update the records in two batches. First for the ID’s which has more than one policy information and than ID’s which has one policy information.

Updated DynamoDB Table

Conclusion

Frequent Modification of data is very important and it’s required for customer business. Python is a convenient program to automate the update tasks in an easy way. In the above experiment, we have compared two different items and updated the records in the DynamoDB table.

from Powerupcloud Tech Blog – Medium