CRUD with Python and DynamoDB: A Step-by-Step Guide
DynamoDB is a NoSQL database from Amazon that handles scale without breaking a sweat. If you need a database that grows with your app and never makes you fiddle with servers, it’s worth knowing.
I’ll show you how to do CRUD with Python using Boto3, the official AWS SDK.
Step 1: Setting up your environment
Here’s what you need before writing any code:
- Create an AWS account if you don’t have one yet.
- Create an IAM user with programmatic access. Attach the AmazonDynamoDBFullAccess policy.
- Install Boto3:
pip install boto3
- Create a DynamoDB table with a primary key called id (String type). If you’re using the AWS console, it takes about 30 seconds.
- Configure your credentials. The cleanest way is through environment variables or the AWS credentials file, like this:
export AWS_ACCESS_KEY_ID=your_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1
Don’t put keys directly in your code. Use environment variables or AWS profiles instead.
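If you would rather create the table from Python than from the console, here is a minimal sketch. The helper name create_id_table and the choice of on-demand billing (PAY_PER_REQUEST) are assumptions for illustration, not part of the setup above; adjust them to your account.

```python
def create_id_table(dynamodb, table_name):
    """Create a table whose partition key is a string attribute named 'id'.

    `dynamodb` is a Boto3 DynamoDB resource, e.g. boto3.resource('dynamodb').
    """
    table = dynamodb.create_table(
        TableName=table_name,
        # 'HASH' marks the partition key; there is no sort key here.
        KeySchema=[{'AttributeName': 'id', 'KeyType': 'HASH'}],
        # 'S' declares the key attribute as a string.
        AttributeDefinitions=[{'AttributeName': 'id', 'AttributeType': 'S'}],
        BillingMode='PAY_PER_REQUEST',  # assumption: on-demand capacity
    )
    table.wait_until_exists()  # block until the table is ready to use
    return table
```

You would call it as create_id_table(boto3.resource('dynamodb'), 'my-table-name') once your credentials are configured.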
Step 2: Writing the code
Let’s build a file called crud.py. First, the imports:
import boto3
from boto3.dynamodb.conditions import Key, Attr
Now set up the DynamoDB resource. The recommended way is to let Boto3 pick up your credentials from the environment or ~/.aws/credentials:
dynamodb = boto3.resource('dynamodb')
If you need to specify a region explicitly:
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
Once that’s done, let’s write the CRUD functions.
Create
def create_item(table_name, item):
    table = dynamodb.Table(table_name)
    response = table.put_item(Item=item)
    return response
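put_item inserts the item. If an item with the same key already exists, it gets overwritten. On success the response contains request metadata; failures don’t show up in the response but raise an exception (typically botocore.exceptions.ClientError).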
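If silently overwriting is not what you want, a condition expression can make the write fail when the key is already taken. The following is a sketch, not a drop-in replacement: create_item_if_absent is a hypothetical helper, and it takes a Table object (dynamodb.Table(table_name)) rather than a table name.

```python
def create_item_if_absent(table, item):
    """Insert `item` only when no item with the same 'id' exists.

    `table` is a Boto3 Table resource. Returns True if the item was
    written and False if the key was already taken.
    """
    try:
        table.put_item(
            Item=item,
            # The write is rejected if an item with this key already exists.
            ConditionExpression='attribute_not_exists(#pk)',
            ExpressionAttributeNames={'#pk': 'id'},
        )
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        return False
    return True
```

Returning a boolean is just one design choice; you could equally let the exception propagate and handle it in the caller.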
Read
def get_item(table_name, item_id):
    table = dynamodb.Table(table_name)
    response = table.get_item(Key={'id': item_id})
    return response.get('Item')
Pass the partition key to get_item. This returns the item itself (or None if nothing matches). Note that get_item only works with the primary key – you can’t filter by other attributes here.
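One thing to know about the returned item: the Boto3 resource API hands numbers back as decimal.Decimal, so an age stored as 30 comes back as Decimal('30'). If you need plain Python numbers (for JSON output, say), a small converter like the one below can help. to_plain is a hypothetical helper, and converting to float can lose precision, so treat it as a sketch.

```python
from decimal import Decimal


def to_plain(value):
    """Recursively replace Decimal values with int or float."""
    if isinstance(value, Decimal):
        # Whole numbers become int, everything else becomes float.
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    if isinstance(value, dict):
        return {key: to_plain(val) for key, val in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(val) for val in value]
    if isinstance(value, set):
        return {to_plain(val) for val in value}
    return value
```

Usage would look like to_plain(get_item(TABLE_NAME, '1')); passing None through is safe because non-container values are returned unchanged.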
Update
def update_item(table_name, item_id, update_expression, expression_values):
    table = dynamodb.Table(table_name)
    response = table.update_item(
        Key={'id': item_id},
        UpdateExpression=update_expression,
        ExpressionAttributeValues=expression_values,
        ReturnValues="UPDATED_NEW"
    )
    return response.get('Attributes')
update_item modifies an existing item, or creates it if no item with that key exists yet. The UpdateExpression tells DynamoDB what to change, and ExpressionAttributeValues provides the values. Setting ReturnValues to "UPDATED_NEW" gives you back only the attributes that changed, with their new values.
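Writing update expressions by hand gets awkward once an attribute name collides with a DynamoDB reserved word (name is one of them), because those need an ExpressionAttributeNames placeholder. Below is a sketch of one way to generate the expression from a plain dict. build_update and update_attributes are hypothetical helpers, and the #a0 / :v0 placeholder scheme is just a convention chosen here.

```python
def build_update(changes):
    """Turn {'attr': value, ...} into the pieces of a SET update.

    Returns (update_expression, attribute_names, attribute_values).
    """
    names, values, parts = {}, {}, []
    for index, (attr, value) in enumerate(changes.items()):
        name_key = f'#a{index}'   # placeholder for the attribute name
        value_key = f':v{index}'  # placeholder for the new value
        names[name_key] = attr
        values[value_key] = value
        parts.append(f'{name_key} = {value_key}')
    return 'SET ' + ', '.join(parts), names, values


def update_attributes(table, item_id, changes):
    """Apply `changes` to the item with the given id on a Boto3 Table."""
    expression, names, values = build_update(changes)
    response = table.update_item(
        Key={'id': item_id},
        UpdateExpression=expression,
        ExpressionAttributeNames=names,
        ExpressionAttributeValues=values,
        ReturnValues='UPDATED_NEW',
    )
    return response.get('Attributes')
```

For example, update_attributes(dynamodb.Table(TABLE_NAME), '1', {'name': 'Jane', 'age': 35}) would send the expression SET #a0 = :v0, #a1 = :v1.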
Delete
def delete_item(table_name, item_id):
    table = dynamodb.Table(table_name)
    response = table.delete_item(Key={'id': item_id})
    return response
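delete_item removes an item by its primary key. The call also succeeds when no item with that key exists, so a successful response alone doesn’t prove that anything was deleted.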
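To find out whether a delete actually removed something, you can ask DynamoDB to return the old item. A minimal sketch, assuming a hypothetical helper delete_item_returning_old that takes a Table object:

```python
def delete_item_returning_old(table, item_id):
    """Delete the item and return its previous contents.

    `table` is a Boto3 Table resource. Returns the deleted item as a dict,
    or None when no item with that id existed.
    """
    response = table.delete_item(
        Key={'id': item_id},
        ReturnValues='ALL_OLD',  # include the item as it was before deletion
    )
    return response.get('Attributes')
```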
Step 3: Testing the code
Here’s a quick test script:
TABLE_NAME = 'my-table-name'
item = {
    'id': '1',
    'name': 'John',
    'age': 30
}
# Create
response = create_item(TABLE_NAME, item)
print(response)
# Read
retrieved = get_item(TABLE_NAME, '1')
print(retrieved)
# Update
update_expression = 'SET age = :val1'
expression_values = {':val1': 35}
updated = update_item(TABLE_NAME, '1', update_expression, expression_values)
print(updated)
# Delete
response = delete_item(TABLE_NAME, '1')
print(response)
This creates an item with id='1', reads it back, updates the age field from 30 to 35, then deletes it.
Best practices
A few things worth keeping in mind when working with DynamoDB:
- Pick a good partition key. Uneven distribution leads to hot partitions and throttling. Your key should have lots of distinct values.
- Use batch operations when you can. batch_writer() buffers your writes, sends them in batches, and retries unprocessed items automatically.
- Limit what you retrieve. Use ProjectionExpression to fetch only the attributes you actually need.
- Use conditional expressions for writes. This ensures atomic updates and prevents accidental overwrites.
- Avoid scans on large tables. Scans read every item. Use queries against the primary key or add a Global Secondary Index instead.
- Paginate through large result sets. Both scan() and query() return paginated results. Use LastEvaluatedKey to walk through all pages.
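Two of these points, batching and pagination, are easier to see in code. The sketch below assumes hypothetical helpers put_many and scan_all that take a Table object; the same LastEvaluatedKey loop applies to query() if you swap the call.

```python
def put_many(table, items):
    """Write items through batch_writer, which batches and retries for us."""
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)


def scan_all(table, **scan_kwargs):
    """Yield every item from a scan, following LastEvaluatedKey across pages."""
    kwargs = dict(scan_kwargs)  # copy so the caller's arguments stay untouched
    while True:
        page = table.scan(**kwargs)
        yield from page.get('Items', [])
        last_key = page.get('LastEvaluatedKey')
        if not last_key:
            return  # no more pages
        # Resume the next request where the previous page stopped.
        kwargs['ExclusiveStartKey'] = last_key
```

Extra keyword arguments to scan_all, such as ProjectionExpression or FilterExpression, are passed straight through to scan(). Because it is a generator, items are fetched page by page as you iterate rather than all at once.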
That’s the basics. You can now build a simple CRUD layer with Python and DynamoDB.