AWS Lambda Cold Starts: Causes, Measurement, and Solutions

Written by Bits Lovers

A Lambda cold start is a tax you pay every time AWS needs to create a new execution environment for your function. For a Python function with minimal dependencies, that tax is 100-400 milliseconds. For a Java function loading Spring Boot, it’s 3-8 seconds. For a container image Lambda running a large JVM application, it can be 10 seconds or more. Whether that matters depends entirely on your use case — an async data processing function that nobody’s waiting on doesn’t care about cold starts. An API endpoint serving user-facing requests absolutely does.

This guide covers exactly what happens during a cold start, how to measure it, and which of the available solutions actually works for your runtime and workload pattern.

What Happens During a Cold Start

Lambda executes functions in micro-VMs managed by Firecracker, AWS's open-source virtual machine monitor. Each execution environment is isolated. When you invoke a function and no warm environment is available, Lambda goes through three phases:

Download phase: Lambda downloads your deployment package (ZIP) or container image to the host. ZIP packages are cached and this step is near-instant for packages already in the regional cache. Container images are larger and the download time scales with image size.

Runtime initialization: Lambda starts the language runtime — the JVM, the Python interpreter, the Node.js process. This is where most of the variation between languages comes from. Python and Node.js start fast. Java’s JVM startup is inherently slow regardless of your code.

Init code execution: Your code outside the handler function runs. Database connections, SDK clients, configuration loading, dependency injection frameworks — everything that runs at module/class level rather than inside the handler itself.

The handler invocation itself isn’t part of the cold start timing. AWS CloudWatch reports Init Duration separately from Duration in Lambda logs. The Init Duration is the cold start. Watch for it:

# Find cold start events in CloudWatch Logs Insights
fields @timestamp, @requestId, @duration, @initDuration, @memorySize
| filter @initDuration > 0
| sort @initDuration desc
| limit 100

@initDuration only appears when an init phase occurred. If a log entry has no @initDuration, it was a warm invocation reusing an existing environment.
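You can also observe cold starts from inside the function itself: module-level code runs exactly once per execution environment, so a module-level flag distinguishes the first (cold) invocation from warm reuse. A minimal sketch — the returned field names are illustrative:

```python
import time

# Module-level code runs once per execution environment (during init),
# so this timestamp marks when the environment was created
_init_time = time.time()
_is_cold = True

def handler(event, context):
    global _is_cold
    was_cold = _is_cold  # True only for the first invocation in this environment
    _is_cold = False
    return {
        "cold_start": was_cold,
        "env_age_seconds": round(time.time() - _init_time, 1),
    }
```

Emitting `cold_start` in your structured logs gives you a per-invocation signal without relying on the `@initDuration` field being present.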

Cold Start Times by Runtime

Approximate P50 cold start times for a minimal function with few dependencies, 512 MB memory allocation:

| Runtime | Typical Init Duration | Notes |
|---|---|---|
| Python 3.12 | 100–250 ms | Fast; increases significantly with large packages |
| Node.js 20 | 100–300 ms | Fast; tree shaking reduces package-related init |
| Go (provided.al2023) | 50–150 ms | Fastest standard runtime |
| .NET 8 | 300–800 ms | Better than earlier .NET versions |
| Java 21 (SnapStart) | 100–200 ms | SnapStart eliminates most of JVM startup |
| Java 21 (no SnapStart) | 800 ms–3 s+ | JVM startup dominates |
| Container image (Python) | 500 ms–2 s | Depends heavily on image size |
| Container image (Java) | 2–10 s | Large images with JVM startup |

These numbers worsen with dependencies. A Python function importing pandas, numpy, and boto3 will see 500ms+ cold starts. A Node.js function bundling an entire express application adds 200-400ms over minimal cold starts.
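To see where Python init time goes, you can time individual imports. A rough sketch using only standard-library modules — run it against your real dependencies, where heavy packages such as pandas typically show hundreds of milliseconds on a cold import:

```python
import importlib
import time

def timed_import(module_name):
    """Return how long importing a module takes, in seconds.
    Approximates that module's contribution to the init phase.
    Note: only the first import is meaningful; repeats hit the cache."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start

print(f"json:    {timed_import('json') * 1000:.2f} ms")
print(f"decimal: {timed_import('decimal') * 1000:.2f} ms")
```

Running this locally in a fresh interpreter gives a reasonable lower bound; Lambda's init phase adds runtime startup on top of it.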

Measuring Cold Start Impact

Before optimizing, quantify what you’re actually seeing. Lambda Insights (requires CloudWatch Lambda Insights extension) provides cold start metrics directly in CloudWatch:

# Enable Lambda Insights on an existing function
aws lambda update-function-configuration \
  --function-name my-function \
  --layers "arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:38"

# Get cold start p99 from CloudWatch Metrics
aws cloudwatch get-metric-statistics \
  --namespace LambdaInsights \
  --metric-name init_duration \
  --dimensions Name=function_name,Value=my-function \
  --extended-statistics p99 \
  --period 3600 \
  --start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ)

Also track cold start frequency: cold starts / total invocations. A function invoked thousands of times per minute at steady state has few cold starts relative to warm invocations. A function invoked sporadically — once every few minutes — sees cold starts on nearly every invocation. The frequency matters as much as the duration.
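Given records exported from CloudWatch Logs (for example, via the Logs Insights query earlier), the frequency calculation is straightforward. A sketch where a record carries an `initDuration` key only when that invocation was cold:

```python
def cold_start_ratio(invocations):
    """Fraction of invocations that were cold starts.
    Each record has an 'initDuration' key only if an init phase ran."""
    if not invocations:
        return 0.0
    cold = sum(1 for inv in invocations if "initDuration" in inv)
    return cold / len(invocations)

sample = [
    {"duration": 12.3, "initDuration": 180.5},  # cold
    {"duration": 8.1},                          # warm
    {"duration": 9.4},                          # warm
    {"duration": 11.0, "initDuration": 210.2},  # cold
]
print(cold_start_ratio(sample))  # → 0.5
```

A ratio near 1.0 on a sporadically invoked function tells you cold starts dominate the user experience even if each one is short.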

Fix 1: Initialize Outside the Handler

The highest-leverage change for any runtime: move expensive initialization outside the handler. Lambda execution environments are reused across multiple invocations. Code outside the handler runs once during the init phase and is available for all subsequent warm invocations:

import boto3
import json

# This runs ONCE during init — cached for warm invocations
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-table')

# Load config once instead of every invocation
ssm = boto3.client('ssm')
config = json.loads(
    ssm.get_parameter(Name='/myapp/config', WithDecryption=True)
       ['Parameter']['Value']
)

def handler(event, context):
    # table and config are already initialized — no overhead here
    item = table.get_item(Key={'id': event['id']})
    return item['Item']

The common mistake: putting boto3 client initialization inside the handler. Every invocation (warm and cold) pays the initialization cost. Moving it outside means cold starts are slightly slower, but warm invocations run significantly faster.
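A middle-ground pattern, useful when some clients are only needed on certain code paths: create them lazily but cache the result, so only the first invocation that needs the client pays the cost. A sketch with a stand-in class — `ExpensiveClient` is hypothetical, standing in for something like a boto3 client:

```python
import functools

class ExpensiveClient:
    """Stand-in for a costly-to-create resource (e.g. a DB connection)."""
    instances_created = 0

    def __init__(self):
        ExpensiveClient.instances_created += 1

@functools.lru_cache(maxsize=None)
def get_client():
    # Pay the construction cost on first use only; lru_cache returns
    # the same instance for every later call in this environment
    return ExpensiveClient()

def handler(event, context):
    client = get_client()  # cached after the first invocation that needs it
    return {"clients_created": ExpensiveClient.instances_created}
```

This keeps cold starts short (nothing heavy runs at import time) while still amortizing the cost across warm invocations.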

Fix 2: Lambda SnapStart for Java

SnapStart, launched in 2022, eliminates most of the JVM cold start overhead for Java functions. The mechanism: Lambda initializes your function’s execution environment, takes a snapshot of the memory and disk state after init completes, and stores it. Subsequent cold starts restore from the snapshot instead of running the init phase fresh.

Enable SnapStart on a Java function:

# Create or update function with SnapStart enabled
aws lambda create-function \
  --function-name my-java-function \
  --runtime java21 \
  --handler com.example.Handler::handleRequest \
  --role arn:aws:iam::123456789012:role/lambda-role \
  --zip-file fileb://function.zip \
  --snap-start '{"ApplyOn":"PublishedVersions"}' \
  --memory-size 512

# SnapStart requires publishing a version
aws lambda publish-version --function-name my-java-function

# Create alias pointing to the version (aliases use SnapStart)
aws lambda create-alias \
  --function-name my-java-function \
  --name prod \
  --function-version 1

SnapStart only works with published versions and aliases — it doesn’t apply to $LATEST. Point your invocations at the alias, not $LATEST.

One caveat with SnapStart: code that generates unique IDs, random seeds, or timestamps during init will produce the same values in every environment restored from the same snapshot. If your init code generates unique state, register a runtime hook via the open-source CRaC API that Lambda's Java runtime supports, and re-initialize that state in afterRestore:

import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;
import java.util.UUID;

public class Handler implements Resource {
    private String instanceId;

    public Handler() {
        // Register this object to receive checkpoint/restore callbacks
        Core.getGlobalContext().register(this);
    }

    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) {
        // Clear state that shouldn't be baked into the snapshot
        instanceId = null;
    }

    @Override
    public void afterRestore(Context<? extends Resource> context) {
        // Re-initialize per-instance state after restore
        instanceId = UUID.randomUUID().toString();
    }
}

Fix 3: Provisioned Concurrency

Provisioned Concurrency keeps a specified number of execution environments pre-initialized and ready. Cold starts disappear for provisioned instances — invocations that hit a provisioned environment have zero init duration.

# Set provisioned concurrency on an alias
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod \
  --provisioned-concurrent-executions 10

# Check status (takes 1-2 minutes to provision)
aws lambda get-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod

Ten provisioned instances means the first ten concurrent invocations are cold-start-free. The eleventh concurrent invocation starts a new environment (cold start).

Provisioned Concurrency is billed at $0.0000041667 per GB-second (us-east-1, x86) for keeping environments warm, on top of a reduced duration rate for the invocations themselves ($0.0000097222/GB-second, versus $0.0000166667/GB-second on demand). For a 512 MB function with 10 provisioned instances running 24/7: 10 × 0.5 GB × 86,400 seconds/day × 30 days × $0.0000041667 ≈ $54/month in provisioned costs, regardless of invocation volume.
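The cost model is simple enough to script. A sketch assuming the current us-east-1 x86 provisioned-concurrency rate of $0.0000041667 per GB-second — verify against AWS's pricing page before relying on it:

```python
def monthly_provisioned_cost(instances, memory_gb,
                             price_per_gb_second=0.0000041667):
    """Monthly cost of keeping provisioned environments warm, assuming
    24/7 provisioning and a 30-day month. Excludes invocation charges.
    The default rate is an assumption (us-east-1, x86)."""
    return instances * memory_gb * 86_400 * 30 * price_per_gb_second

# 10 provisioned instances of a 512 MB function
print(f"${monthly_provisioned_cost(10, 0.5):.2f}")  # → $54.00
```

Plugging in your own instance counts and memory sizes makes it easy to compare a provisioned baseline against simply tolerating the cold starts.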

Use Application Auto Scaling to scale provisioned concurrency based on time of day:

# Register the function as a scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-function:prod \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 2 \
  --max-capacity 20

# Scale up to 10 at the start of business hours (08:00 UTC)
aws application-autoscaling put-scheduled-action \
  --service-namespace lambda \
  --resource-id function:my-function:prod \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --scheduled-action-name "business-hours" \
  --schedule "cron(0 8 * * ? *)" \
  --scalable-target-action MinCapacity=10,MaxCapacity=10

# Scale back down to 2 overnight (20:00 UTC)
aws application-autoscaling put-scheduled-action \
  --service-namespace lambda \
  --resource-id function:my-function:prod \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --scheduled-action-name "overnight" \
  --schedule "cron(0 20 * * ? *)" \
  --scalable-target-action MinCapacity=2,MaxCapacity=2

Fix 4: ARM64 and Memory Tuning with Lambda Power Tuning

Switching to ARM64 (Graviton2) typically reduces cold start times 5-15% and lowers cost ~20% for the same memory allocation. Architecture is a property of the function's code package, so it is set with update-function-code rather than update-function-configuration:

# Switch an existing ZIP-based function to arm64 on redeploy
aws lambda update-function-code \
  --function-name my-function \
  --zip-file fileb://function.zip \
  --architectures arm64

Works for Python, Node.js, Go, and container images built for arm64. Java on Graviton has more complex trade-offs — test before switching.

Memory configuration affects cold start times because more memory means more CPU. A function at 128 MB gets one vCPU fraction; at 1,792 MB it gets a full vCPU; beyond that, additional CPU cores. Lambda Power Tuning finds the optimal memory setting automatically:

# Deploy Lambda Power Tuning from the Serverless Application Repository (one-time setup)
aws serverlessrepo create-cloud-formation-change-set \
  --application-id arn:aws:serverlessrepo:us-east-1:451282441545:applications/aws-lambda-power-tuning \
  --stack-name serverlessrepo-lambda-power-tuning \
  --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND

# Execute the change set ARN returned by the previous command
aws cloudformation execute-change-set --change-set-name <change-set-arn>

# Start a tuning state machine execution
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:powerTuningStateMachine \
  --input '{
    "lambdaARN": "arn:aws:lambda:us-east-1:123456789012:function:my-function",
    "powerValues": [128, 256, 512, 1024, 2048, 3008],
    "num": 10,
    "payload": {"test": "payload"},
    "parallelInvocation": true,
    "strategy": "cost"
  }'

Power Tuning invokes your function at each memory level, measures duration and cost, and produces a visualization showing the optimal setting for cost, speed, or a balanced trade-off. Most functions have a “sweet spot” where increasing memory reduces duration enough to lower total cost.
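The sweet-spot effect is easy to see on paper. A sketch with hypothetical duration measurements at three memory levels, using the standard on-demand x86 duration rate (verify current pricing):

```python
def invocation_cost(memory_gb, duration_ms,
                    price_per_gb_second=0.0000166667):
    """Duration cost of one invocation (excludes the per-request fee).
    The default rate is an assumption (on-demand, x86)."""
    return memory_gb * (duration_ms / 1000) * price_per_gb_second

# Hypothetical measurements: doubling memory halves duration at first,
# so cost stays flat while latency improves -- until the CPU gain tapers off
for mem_mb, dur_ms in [(512, 800), (1024, 400), (2048, 260)]:
    cost = invocation_cost(mem_mb / 1024, dur_ms)
    print(f"{mem_mb:>5} MB @ {dur_ms} ms -> ${cost:.10f}")
```

In this (invented) data, 512 MB and 1024 MB cost the same per invocation, but the 1024 MB run finishes in half the time; 2048 MB costs more because the speedup no longer keeps pace with the price. Power Tuning finds this point from real measurements instead of guesses.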

Keeping Package Size Small

Every MB in your deployment package adds to cold start download time. For Node.js, bundle with esbuild or webpack and tree-shake unused imports:

# Bundle a Node.js Lambda with esbuild
esbuild src/handler.ts \
  --bundle \
  --platform=node \
  --target=node20 \
  --external:@aws-sdk/* \
  --minify \
  --outfile=dist/handler.js

# Zip only the bundle (not node_modules)
zip -j function.zip dist/handler.js

The --external:@aws-sdk/* flag excludes the AWS SDK from the bundle since Lambda provides it. The bundled output for a typical API handler might be 50-100 KB instead of 10-30 MB for an unbundled node_modules folder.

For Python, use Lambda Layers to share large dependencies (pandas, numpy) across functions rather than including them in every deployment package. The Lambda Layers guide covers the layer creation and attachment workflow.

Cold starts are solvable at every level of severity. The right solution depends on your runtime, traffic pattern, and latency requirements. Java on SnapStart with Provisioned Concurrency handles the most demanding use cases. Python or Node.js with careful package management and module-level initialization handles most API use cases with no additional cost.
