CRUD with Python and DynamoDB: A Step-by-Step Guide

Written by Bits Lovers on 12 Apr 2023

DynamoDB is a fully managed NoSQL database from Amazon Web Services. Capacity scales with your traffic and there are no servers to provision or patch, so if you need a database that grows with your app, it’s worth knowing.

I’ll show you how to create, read, update, and delete (CRUD) items with Python using Boto3, the official AWS SDK.

Step 1: Setting up your environment

Here’s what you need before writing any code:

Create an AWS account if you don’t have one yet.
Create an IAM user with programmatic access. Attach the AmazonDynamoDBFullAccess policy.
Install Boto3:

pip install boto3

Create a DynamoDB table with a primary key called id (String type). If you’re using the AWS console, it takes about 30 seconds.
Configure your credentials. The cleanest way is through environment variables or the AWS credentials file, like this:

export AWS_ACCESS_KEY_ID=your_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1

Don’t put keys directly in your code. Use environment variables or AWS profiles instead.
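If you’d rather create the table from Python than from the console, here is a minimal sketch. The function name create_table_if_missing is my own, and on-demand billing (PAY_PER_REQUEST) is an assumption chosen to avoid picking capacity numbers; adjust both to your setup. It takes the Boto3 DynamoDB resource as an argument.

```python
def create_table_if_missing(dynamodb, table_name):
    """Create a table with a string partition key called `id`.

    `dynamodb` is a Boto3 DynamoDB service resource, for example
    boto3.resource("dynamodb"). If the table already exists, the
    existing table is returned instead.
    """
    client = dynamodb.meta.client
    try:
        table = dynamodb.create_table(
            TableName=table_name,
            # HASH marks `id` as the partition key.
            KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
            # "S" declares the key attribute as a String.
            AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
            # Assumption: on-demand billing, so no capacity units to set.
            BillingMode="PAY_PER_REQUEST",
        )
    except client.exceptions.ResourceInUseException:
        # DynamoDB reports an already-existing table with this error.
        return dynamodb.Table(table_name)

    # Table creation is asynchronous; block until it is usable.
    table.wait_until_exists()
    return table
```

Call it as create_table_if_missing(boto3.resource('dynamodb'), 'my-table-name') once your credentials are configured.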

Step 2: Writing the code

Let’s build a file called crud.py. First, the imports:


import boto3
from boto3.dynamodb.conditions import Key, Attr

Now set up the DynamoDB resource. The recommended way is to let Boto3 pick up your credentials from the environment or ~/.aws/credentials:


dynamodb = boto3.resource('dynamodb')

If you need to specify a region explicitly:


dynamodb = boto3.resource('dynamodb', region_name='us-east-1')

Once that’s done, let’s write the CRUD functions.

Create


def create_item(table_name, item):
    table = dynamodb.Table(table_name)
    response = table.put_item(Item=item)
    return response

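put_item inserts the item. If an item with the same key already exists, it gets overwritten without warning. When the call fails, Boto3 raises botocore.exceptions.ClientError; on success, the response contains request metadata (ResponseMetadata, including the HTTP status code) rather than the item itself.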
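Because put_item silently overwrites, you may want a variant that refuses to replace an existing item. The sketch below is one way to do it, assuming the partition key is named id as in this guide; create_if_absent is a hypothetical helper name. It takes the Table object directly, for example dynamodb.Table(table_name).

```python
def create_if_absent(table, item):
    """Insert `item` only if no item with the same `id` exists.

    Returns True if the item was written, False if one was already there.
    """
    try:
        table.put_item(
            Item=item,
            # The write is rejected if an item with this key already exists.
            ConditionExpression="attribute_not_exists(id)",
        )
        return True
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        # The condition failed, so the existing item was left untouched.
        return False
```

The check and the write happen as a single operation on the DynamoDB side, so two clients racing to create the same id can’t both succeed.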

Read


def get_item(table_name, item_id):
    table = dynamodb.Table(table_name)
    response = table.get_item(Key={'id': item_id})
    return response.get('Item')

Pass the partition key to get_item. This returns the item itself (or None if nothing matches). Note that get_item only works with the primary key – you can’t filter by other attributes here.
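One thing that surprises people on the read side: the Boto3 resource API returns DynamoDB numbers as decimal.Decimal, which json.dumps can’t serialize. Here is a small sketch of a converter, where to_plain is a hypothetical helper name. Converting to float can lose precision, so treat it as a convenience for display or JSON output, not for money.

```python
from decimal import Decimal


def to_plain(value):
    """Recursively replace Decimal values with int or float."""
    if isinstance(value, Decimal):
        # Whole numbers become int; everything else becomes float.
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    if isinstance(value, dict):
        return {key: to_plain(val) for key, val in value.items()}
    if isinstance(value, list):
        return [to_plain(val) for val in value]
    if isinstance(value, set):
        # DynamoDB number sets arrive as Python sets of Decimal.
        return {to_plain(val) for val in value}
    return value
```

With this in place, to_plain(get_item(TABLE_NAME, '1')) gives you a dictionary with ordinary Python numbers.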

Update


def update_item(table_name, item_id, update_expression, expression_values):
    table = dynamodb.Table(table_name)
    response = table.update_item(
        Key={'id': item_id},
        UpdateExpression=update_expression,
        ExpressionAttributeValues=expression_values,
        ReturnValues="UPDATED_NEW"
    )
    return response.get('Attributes')

update_item modifies an existing item, or creates it if no item with that key exists. The UpdateExpression tells DynamoDB what to change, and ExpressionAttributeValues provides the values. The ReturnValues parameter set to "UPDATED_NEW" gives you back only the attributes that changed, with their new values.
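Writing update expressions by hand gets tedious, and attribute names such as name are DynamoDB reserved words that fail unless you alias them through ExpressionAttributeNames. The sketch below builds the expression from a plain dictionary; build_update and the #a0 / :v0 placeholder scheme are my own naming, not part of Boto3.

```python
def build_update(changes):
    """Turn {'name': 'Jane', 'age': 35} into update_item keyword arguments.

    Every attribute name is aliased with a '#' placeholder, so reserved
    words such as 'name' are safe to use.
    """
    if not changes:
        raise ValueError("changes must not be empty")

    names, values, clauses = {}, {}, []
    for index, (attribute, value) in enumerate(changes.items()):
        name_key = f"#a{index}"    # placeholder for the attribute name
        value_key = f":v{index}"   # placeholder for the new value
        names[name_key] = attribute
        values[value_key] = value
        clauses.append(f"{name_key} = {value_key}")

    return {
        "UpdateExpression": "SET " + ", ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

You would pass the result straight through, for example table.update_item(Key={'id': '1'}, ReturnValues='UPDATED_NEW', **build_update({'name': 'Jane', 'age': 35})).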

Delete


def delete_item(table_name, item_id):
    table = dynamodb.Table(table_name)
    response = table.delete_item(Key={'id': item_id})
    return response

delete_item removes an item by its primary key.
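delete_item succeeds even when there is nothing to delete, so the plain response won’t tell you whether the item existed. A minimal sketch of one way to find out, using ReturnValues='ALL_OLD'; delete_and_return is a hypothetical helper name, and it takes the Table object directly.

```python
def delete_and_return(table, item_id):
    """Delete the item with the given id and return what was deleted.

    Returns the old item as a dict, or None if no such item existed.
    """
    response = table.delete_item(
        Key={"id": item_id},
        # Ask DynamoDB to send back the item as it was before deletion.
        ReturnValues="ALL_OLD",
    )
    # 'Attributes' is only present when an item was actually removed.
    return response.get("Attributes")
```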

Step 3: Testing the code

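Here’s a quick test script. Add it to the bottom of crud.py and replace my-table-name with the name of the table you created in Step 1: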


TABLE_NAME = 'my-table-name'

item = {
    'id': '1',
    'name': 'John',
    'age': 30
}

# Create
response = create_item(TABLE_NAME, item)
print(response)

# Read
retrieved = get_item(TABLE_NAME, '1')
print(retrieved)

# Update
update_expression = 'SET age = :val1'
expression_values = {':val1': 35}
updated = update_item(TABLE_NAME, '1', update_expression, expression_values)
print(updated)

# Delete
response = delete_item(TABLE_NAME, '1')
print(response)

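This creates an item with id='1', reads it back, updates the age field from 30 to 35, then deletes it. Boto3 returns DynamoDB numbers as Decimal, so the read shows the age as Decimal('30') rather than 30.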

Best practices

A few things worth keeping in mind when working with DynamoDB:

Pick a good partition key. Uneven distribution leads to hot partitions and throttling. Your key should have lots of distinct values.
Use batch operations when you can. batch_writer() buffers your writes, sends them in batches of up to 25 items, and resends unprocessed items automatically.
Limit what you retrieve. Use ProjectionExpression to fetch only the attributes you actually need.
Use condition expressions for writes. The check and the write happen atomically, which prevents accidental overwrites.
Avoid scans on large tables. A scan reads every item in the table, and you pay for all of those reads even if a filter discards most of the results. Use queries against the primary key or add a Global Secondary Index instead.
Paginate through large result sets. Both scan() and query() return at most 1 MB of data per call. When a response includes LastEvaluatedKey, pass it as ExclusiveStartKey on the next call, and repeat until it no longer appears.
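Two of the practices above, batching and pagination, are easier to see in code. This is a sketch rather than a finished library: put_many and iter_pages are hypothetical helper names, and both take Boto3 objects as arguments (a Table, and a bound method such as table.scan or table.query).

```python
def put_many(table, items):
    """Write many items through Boto3's batch writer."""
    # The batch writer buffers puts and flushes them in batches,
    # resending anything DynamoDB reports as unprocessed.
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)


def iter_pages(operation, **kwargs):
    """Yield every item from table.scan or table.query, page by page.

    `operation` is the bound method to call; `kwargs` are passed to it
    unchanged, apart from ExclusiveStartKey on follow-up calls.
    """
    while True:
        response = operation(**kwargs)
        yield from response.get("Items", [])

        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            # No LastEvaluatedKey means this was the final page.
            return
        # Resume the next call where this page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

For example, list(iter_pages(table.scan, ProjectionExpression='id')) collects the id of every item, and iter_pages(table.query, KeyConditionExpression=Key('id').eq('1')) does the same for a query. Since iter_pages is a generator, you can also stop early without fetching the remaining pages.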