Optimize S3 Performance
S3 is fast out of the box, but there’s a difference between “works fine” and “handles serious traffic.” This post covers how to push S3 harder without resorting to Transfer Acceleration.
We’ll start with prefixes – the single biggest lever for S3 throughput. Then we’ll dig into the KMS gotchas that catch people off guard, and finish with multipart uploads and byte-range fetches for moving large objects.
Use Prefixes to Scale S3 Throughput
Every object in S3 lives at a path that looks something like this:
bucketName/FolderA/SubB/documentA.pdf
The prefix is everything between the bucket name and the object name. In this case, that's FolderA/SubB. (There are no real folders in S3; a prefix is just the leading part of the object key.)
Here’s another example:
bucketName/FolderB/documentB.pdf
Prefix: FolderB – no subfolder, just one level deep.
Why prefixes matter for performance
S3 gives you a baseline of 3,500 write requests (PUT, COPY, POST, DELETE) and 5,500 read requests (GET, HEAD) per second, per prefix. First-byte latency sits in the 100-200ms range.
The key word there is “per prefix.” If you spread your objects across multiple prefixes, those limits stack. Two prefixes means 11,000 reads per second. Four prefixes gets you to 22,000.
This isn’t theoretical – it’s how S3 partitions work internally. The service routes requests to different partitions based on the prefix, and each partition handles its own request budget.
Practical example: Say you’re serving images for a high-traffic application. Instead of dumping everything into one prefix like images/, structure it as images/category-a/, images/category-b/, images/category-c/, and so on. Each category gets its own 5,500 GET/s allocation.
The takeaway is straightforward: the more you spread your data across distinct prefixes, the more headroom you have.
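One simple way to spread load is to derive the prefix from a hash of the object name. This is a minimal sketch in plain Python; `shard_key` and the shard count of 4 are illustrative choices, not an AWS API:

```python
import hashlib

def shard_key(base_prefix: str, object_name: str, shards: int = 4) -> str:
    """Derive a stable shard prefix from the object name so requests
    spread across `shards` distinct S3 prefixes."""
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    return f"{base_prefix}/shard-{shard}/{object_name}"

# Different objects land in different prefixes, and each prefix
# brings its own 3,500 write / 5,500 read per-second budget.
print(shard_key("images", "cat.jpg"))
print(shard_key("images", "dog.jpg"))
```

Because the shard is computed from the name, readers and writers agree on the key without any lookup table.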
The KMS Trap
If you enable SSE-KMS encryption on your bucket, every upload triggers a GenerateDataKey API call to KMS. Every download triggers a Decrypt call. Under load, those KMS calls can become your bottleneck – not S3 itself.
KMS has per-region request quotas for symmetric cryptographic operations. As of 2026, the default quotas are:
- 10,000 requests/second in most regions
- 20,000 requests/second in some regions
- 100,000 requests/second in us-east-1, us-west-2, and eu-west-1
When this post was first published, these quotas were hard limits. They're now adjustable: you can request an increase through the Service Quotas console or AWS Support if you're hitting the ceiling.
S3 Bucket Keys: the easier fix
AWS introduced S3 Bucket Keys to address exactly this problem. With a bucket key enabled, S3 requests a short-lived bucket-level key from KMS and uses it to encrypt objects, instead of calling KMS for every single object. AWS states this can cut KMS request traffic by up to 99%.
If you need both encryption and performance, enable S3 Bucket Keys. It works with SSE-KMS and reduces your exposure to KMS throttling without weakening your encryption posture.
If you’re troubleshooting mysterious S3 slowdowns and you use SSE-KMS, check your CloudWatch metrics for KMS throttling. It’s a common root cause that doesn’t show up in S3 metrics.
Want more context on encryption options? See the difference between AWS KMS and CloudHSM.
Multipart Uploads and Byte-Range Fetches
For large objects, the default single-request upload/download approach breaks down quickly: one network hiccup forces you to restart the entire transfer from zero.
Multipart uploads split a large file into parts and upload them in parallel. AWS recommends multipart for anything over 100 MB and requires it for objects over 5 GB. The benefit is straightforward: parallel transfers mean faster uploads, and if one part fails, you only retry that part – not the entire file.
Byte-range fetches do the same thing for downloads. You request specific byte ranges of an object in parallel, then reassemble them client-side. Beyond speed, this is handy for reading just the header or footer of a large file without downloading the whole thing.
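The planning half of a byte-range fetch is just arithmetic over the object size. A small sketch (`plan_byte_ranges` is an illustrative helper, not an SDK function); each resulting string is a valid HTTP `Range` header you could pass to `get_object`:

```python
def plan_byte_ranges(object_size: int, part_size: int) -> list[str]:
    """Split an object of `object_size` bytes into HTTP Range headers
    covering at most `part_size` bytes each (ranges are inclusive)."""
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges

# Each range can then be fetched in parallel, e.g. with boto3:
#   s3.get_object(Bucket=bucket, Key=key, Range=r)["Body"].read()
print(plan_byte_ranges(10, 4))  # → ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Reassembly is then a matter of writing each part at its `start` offset, in any completion order.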
Most AWS SDKs handle multipart uploads automatically when you configure a threshold. The CLI does too – aws s3 cp with large files will use multipart transparently.
Wrapping Up
S3 performance comes down to three things:
- Spread reads across prefixes to multiply your per-second request limits.
- Watch out for KMS throttling if you use SSE-KMS. Enable S3 Bucket Keys to reduce KMS API calls by up to 99%, and remember that KMS quotas are now adjustable.
- Use multipart uploads and byte-range fetches for any object over 100 MB.
If you’re working with KMS in your projects, keep an eye on those quotas. They’ve changed significantly since 2021, and what was once a hard ceiling is now something you can work around.