Amazon Keyspaces vs. Timestream: A Cost-Driven Decision Guide That Actually Helps

Written by Bits Lovers

Here’s the thing about picking between Keyspaces and Timestream: the marketing pages make them look like they’re in the same category. They’re both “managed databases on AWS.” They both scale. They both have names that sound vaguely similar if you squint.

But they solve fundamentally different problems, and picking the wrong one will cost you money you didn’t budget for, performance you can’t tune your way out of, or both.

Let me cut through the noise and give you the decision framework I wish I’d had before making this choice for a production system.

What You’re Actually Choosing Between

Amazon Keyspaces is a managed, Apache Cassandra-compatible database service. You get a CQL (Cassandra Query Language) API, horizontal scaling, multi-AZ replication, and serverless capacity options. It’s designed for workloads that Cassandra is good at: high write throughput, wide-column data models, and applications that need to scale to billions of rows without redesigning the schema.

Amazon Timestream is a managed time-series database. You get SQL-compatible queries, automatic data tiering between memory and magnetic storage, and query optimization for timestamp-ordered data. It’s designed for IoT sensor data, application metrics, log events — anything where time is the primary axis of your queries and you want to keep recent data fast while archiving historical data cheaply.

The catchphrase version: Keyspaces is for when you need Cassandra without managing Cassandra. Timestream is for when you need time-series analytics without building time-series analytics infrastructure.

The Cost Model That Determines Everything

This is where most people get surprised. Both services have on-demand pricing that sounds reasonable until you run the numbers at scale.

Keyspaces Cost Model

Keyspaces charges per request (per operation) plus storage. On-demand mode:

  • Write: $1.25 per million write units
  • Read: $0.25 per million read units
  • Storage: $0.25 per GB-month

A “write unit” covers up to 1KB of data per row, and billing rounds up per row. If you’re writing 100-byte records, each write still consumes one full write unit. If you’re writing 10KB records, that’s 10 write units each.

The problem: high-volume Cassandra workloads can produce tens of millions of operations per day. At 50 million writes/day, you’re looking at roughly $62.50/day just in on-demand write costs, or about $1,875/month before storage. Provisioned capacity can reduce this — you reserve capacity ahead of time at lower rates — but then you’re paying for capacity you might not use.
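That arithmetic is worth automating before you commit. Here’s a minimal sketch of the on-demand write-cost math, using the $1.25-per-million rate quoted above (check current regional pricing before relying on it):

```python
import math

def keyspaces_daily_write_cost(writes_per_day: int, row_bytes: int,
                               price_per_million_wru: float = 1.25) -> float:
    """On-demand daily write cost. Each row is billed in 1 KB write units,
    rounded up per row, so a 100-byte row still consumes one full unit."""
    wrus_per_row = math.ceil(row_bytes / 1024)
    daily_wrus = writes_per_day * wrus_per_row
    return daily_wrus / 1_000_000 * price_per_million_wru

# The 50M writes/day scenario: $62.50/day, ~$1,875/month
daily = keyspaces_daily_write_cost(50_000_000, 100)
monthly = daily * 30
```

Note how row size only matters in 1KB steps: shrinking a 100-byte record to 50 bytes saves you nothing, but splitting a 10KB record matters a lot.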

Here’s a real scenario I worked with: a customer migrated a Cassandra workload to Keyspaces expecting 30% cost savings. Their existing Cassandra cluster processed about 800 million writes/day. The math: 800M writes at $1.25/M = $1,000/day = $30,000/month. Their existing infrastructure cost $18,000/month. The managed service was more expensive at their scale.

Provisioned capacity helped: if they reserved enough capacity for their average load with auto-scaling for peaks, they got down to roughly $12,000/month. Still more than self-managed. But with zero operational overhead.

Timestream Cost Model

Timestream has two tiers that matter: memory store and magnetic store.

  • Memory store (recent data, configurable retention): $0.036 per GB written, $0.012 per GB scanned
  • Magnetic store (older data): $0.012 per GB written, $0.006 per GB scanned
  • Query: $0.01 per GB scanned

The magnetic store is roughly 3x cheaper per GB written and 2x cheaper per GB scanned. The tradeoff: magnetic store queries are slower (this is cold storage, not a cache).

For a typical IoT workload — 1 billion events/day, 100 bytes each, 30-day hot retention, 1-year cold retention — the rough math:

  • 1B events/day × 100 bytes = 100 GB/day written
  • Memory store (30 days): 3,000 GB × $0.036 = $108/day
  • Magnetic store (335 days): 33,500 GB × $0.012 = $402/day
  • Total writes: ~$510/day = ~$15,300/month

Plus query costs. Timestream queries scan data. An inefficient query that scans 1TB of data costs $10. If your analysts run 50 poorly-written queries/day, that’s $500/day in query costs alone. This is the hidden bill that surprises people.
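The same arithmetic in code — a sketch using this article’s example rates, which you should swap for current regional pricing (Timestream also meters the tiers differently in practice, so treat this as an estimate, not an invoice):

```python
def timestream_monthly_cost(gb_written_per_day: float,
                            hot_days: int, total_retention_days: int,
                            query_gb_scanned_per_day: float = 0.0,
                            hot_write_rate: float = 0.036,   # $/GB, example rate
                            cold_write_rate: float = 0.012,  # $/GB, example rate
                            query_rate: float = 0.01) -> float:  # $/GB scanned
    """Rough monthly cost: hot-tier volume + cold-tier volume + query scans."""
    hot_gb = gb_written_per_day * hot_days
    cold_gb = gb_written_per_day * (total_retention_days - hot_days)
    daily_write_cost = hot_gb * hot_write_rate + cold_gb * cold_write_rate
    daily_query_cost = query_gb_scanned_per_day * query_rate
    return (daily_write_cost + daily_query_cost) * 30

# IoT example above: 100 GB/day, 30-day hot, 365-day total retention
writes_only = timestream_monthly_cost(100, 30, 365)
# Add 50 careless queries/day scanning 1 TB (1,000 GB) each
with_bad_queries = timestream_monthly_cost(100, 30, 365,
                                           query_gb_scanned_per_day=50_000)
```

Running both cases makes the hidden bill visible: the careless-query scenario adds $15,000/month on top of the write costs.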

The Crossover Point

For Keyspaces: if you’re processing fewer than about 5 million writes/day with modest storage needs, on-demand Keyspaces is cost-competitive with self-managed Cassandra. Above that, run the numbers carefully. Provisioned capacity with auto-scaling shifts the crossover point.

For Timestream: if your time-series data is genuinely write-heavy and your query patterns are predictable (dashboard refreshes, scheduled reports), Timestream is usually cheaper than running a self-managed InfluxDB or ClickHouse cluster. If you’re running many ad-hoc analytical queries, query costs can spiral.

Write-Heavy vs. Query-Heavy: The Decision Matrix

The most important question to ask: what’s your ratio of writes to reads?

Keyspaces excels at write-heavy workloads. If you’re writing more than you’re reading — event ingestion, sensor data collection, clickstream logging — and you need horizontal write scalability without partition hotspots, Keyspaces is built for this. The Cassandra-compatible storage engine handles millions of writes per second across a distributed cluster.

Timestream is built for query-heavy time-series analysis. If you’re writing data once (or infrequently updating records with new timestamps) and then running many analytical queries against it — dashboards, aggregation reports, anomaly detection — Timestream’s query optimization pays off. The automatic tiering means recent data is fast, and the SQL interface means analysts can write queries without learning a new query language.

A workload that writes user actions and then queries aggregates over time windows: Timestream wins. A workload that writes sensor readings and then needs to retrieve individual sensor histories by partition key: Keyspaces wins.

Migration: The Part Nobody Talks About

Here’s what the comparison pages don’t tell you: migrations are painful for both services, but for completely different reasons.

Migrating to Keyspaces from Cassandra

If you’re migrating from self-managed Cassandra, the good news is that the CQL API is the same. Your existing drivers, your existing query patterns, your data model (wide rows, composite partition keys) all transfer.

The bad news: Keyspaces has some Cassandra features that aren’t supported. Materialized views, user-defined functions, and stored procedures don’t exist in Keyspaces. If your schema relies on these, you need to rearchitect those patterns.

The real migration cost is in the data movement itself. For a 10TB dataset:

  • AWS DMS (Database Migration Service) supports Cassandra as a source. It handles the initial load and then replicates ongoing changes.
  • For larger datasets, you might use Spark with the Cassandra Spark connector to bulk-migrate, then switch to DMS for change data capture.

The gotcha I see constantly: consistency in Keyspaces is chosen per query, and it’s priced accordingly. LOCAL_ONE (eventually consistent) reads cost one read unit; LOCAL_QUORUM reads cost twice as much. If your Cassandra application relies on quorum reads, your read bill doubles — and if your drivers default to LOCAL_ONE, you’ve silently weakened the consistency your application assumed.

Migrating to Timestream

Timestream is not a general-purpose database. If you’re trying to migrate from MySQL or PostgreSQL, Timestream is almost certainly the wrong choice. It doesn’t support complex joins outside of time-series patterns. It doesn’t do transactional writes across multiple tables well. It doesn’t have foreign key constraints.

The migration path looks like:

  1. Extract your time-series data from the source system
  2. Identify the timestamp column and any dimensions (tags)
  3. Load into Timestream using the SDK, Amazon Kinesis Data Firehose, or batch load from Amazon S3
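Steps 2 and 3 mostly amount to reshaping rows. Here’s a minimal sketch of that mapping — the `sensor_id` dimension, `temperature` measure, and row fields are hypothetical; the resulting dict is the shape you’d pass in the `Records` list of boto3’s `timestream-write` `write_records` call:

```python
from datetime import datetime, timezone

def to_timestream_record(row: dict) -> dict:
    """Map a source row to a Timestream record: one timestamp, string-typed
    dimensions (tags), and a single typed measure."""
    ts: datetime = row["created_at"]
    return {
        "Dimensions": [
            {"Name": "sensor_id", "Value": row["sensor_id"]},
        ],
        "MeasureName": "temperature",
        "MeasureValue": str(row["temperature"]),
        "MeasureValueType": "DOUBLE",
        # Timestream takes the timestamp as a string; milliseconds since epoch
        "Time": str(int(ts.timestamp() * 1000)),
    }

record = to_timestream_record({
    "sensor_id": "temp-01",
    "temperature": 21.5,
    "created_at": datetime(2026, 1, 15, tzinfo=timezone.utc),
})
```

Everything relational (user names, device metadata) stays out of the record; only flat string dimensions and the measure go in, which is exactly the constraint the next paragraph is about.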

The most common migration mistake I see: treating Timestream like a regular relational database and trying to model complex relationships. If you have a users table and an events table and you want to join them to get “events for user X,” that’s not a time-series pattern. Timestream handles it, but badly. In that case, keep users in RDS or DynamoDB and write user IDs into Timestream as dimensions.

Migration from Keyspaces to Timestream (or vice versa)

There is no direct migration path. These are different data models, different query languages, different everything. If you’re moving from one to the other, you’re doing an ETL: extract the data, transform it into the target schema, load it into the new service.

This is why the initial decision matters so much. Getting it wrong means a full data migration project that could have been avoided.

The Failure Modes That Kill Performance

Keyspaces: Partition Hotspots

Cassandra’s Achilles heel is partition hotspots. If all your writes go to the same partition because your partition key is wrong, one node bears all the load and the rest sit idle. Keyspaces doesn’t escape this — it just surfaces the problem differently.

Keyspaces doesn’t expose per-partition metrics directly, but the symptom shows up in CloudWatch: read or write throttling (ReadThrottleEvents, WriteThrottleEvents) on a table that’s nowhere near its overall capacity is the classic signature of a hot partition. When you see it, you need to redesign your partition key.

The fix is never cheap. It means reingesting your data with a new partition key. For a production system with billions of rows, that’s a migration project.

Prevention: design your partition key with traffic patterns in mind, not just data structure. A common mistake is partitioning by user ID alone (all of user X’s events land on the same partition forever) when you should partition by time bucket + user ID (user X’s events for day D land on partition D-X). This distributes load across partitions even when one user generates far more traffic than others.
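A minimal sketch of that pattern (names and key format are illustrative): derive the partition key from a UTC day bucket plus the user ID, so even a noisy user’s writes rotate to a fresh partition every day.

```python
from datetime import datetime, timezone

def event_partition_key(user_id: str, event_time: datetime) -> str:
    """Composite partition key: day bucket + user id. All of user X's events
    for one day share a partition; the next day moves to a new one."""
    day_bucket = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"{day_bucket}#{user_id}"

key = event_partition_key("user-42", datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc))
# key == "2026-01-15#user-42"
```

The cost of this design is on the read side: fetching a user’s full history now means querying one partition per day in the range, so pick a bucket size that matches how your reads span time.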

Timestream: Query Pruning Failures

Timestream stores data in time-ordered partitions. If your queries don’t filter by time, Timestream scans your entire dataset. For a 3-year dataset with 100GB/day of writes, that’s 100TB scanned per query. At $0.006/GB for magnetic store scans, that’s $600 per query.

The fix is mandatory: always include a time range filter in your WHERE clause.

-- WRONG: Scans entire dataset
SELECT * FROM sensor_data WHERE sensor_id = 'temp-01'

-- RIGHT: Scans only the time window
SELECT * FROM sensor_data
WHERE sensor_id = 'temp-01'
AND time BETWEEN '2026-01-01' AND '2026-01-31'

Timestream will warn you if a query doesn’t include a time filter, but it won’t block it. In on-demand query mode, a single careless query can cost hundreds of dollars.
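Since Timestream warns but doesn’t block, a cheap guardrail is to pre-check query strings in your own tooling before submitting them. This is a crude sketch — a regex check, not a real SQL parser, and not an AWS feature — but it catches the obvious full-scan case:

```python
import re

def has_time_filter(sql: str) -> bool:
    """Crude pre-flight check: does anything after WHERE mention the time
    column? Rejects the obvious full-scan queries; not a real SQL parser."""
    match = re.search(r"\bwhere\b(.*)", sql, re.IGNORECASE | re.DOTALL)
    return bool(match and re.search(r"\btime\b", match.group(1), re.IGNORECASE))

ok = has_time_filter(
    "SELECT * FROM sensor_data WHERE sensor_id = 'temp-01' "
    "AND time BETWEEN '2026-01-01' AND '2026-01-31'"
)
bad = has_time_filter("SELECT * FROM sensor_data WHERE sensor_id = 'temp-01'")
```

Wire a check like this into whatever submits ad-hoc analyst queries, and the $600 accident becomes a rejected request instead of a line item.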

The Decision Framework

Here’s how I think about this decision:

Choose Keyspaces if:

  • You’re replacing or migrating an existing Cassandra workload
  • You need wide-column data modeling (rows with hundreds of columns, sparse data)
  • You need to query by partition key and get fast point reads for specific records
  • You need conditional writes (Keyspaces supports Cassandra lightweight transactions for compare-and-set operations, though not multi-row ACID transactions)
  • Your team already knows CQL

Choose Timestream if:

  • Your primary access pattern is “give me all events in time range X”
  • You’re storing IoT sensor data, application metrics, logs, or clickstream
  • You want automatic data lifecycle management (hot → cold tiering)
  • You want analysts to write SQL without learning CQL
  • You need window functions, aggregation over time periods, or interpolation

Choose neither if:

  • You need complex multi-table joins
  • Your data model has many-to-many relationships
  • You’re doing transactional operations across multiple record types
  • Your team doesn’t have experience with either Cassandra or time-series patterns

Run the cost model first. Before you commit to either service, model your actual workload. For Keyspaces: estimate your daily write units (one unit per row per KB, rounded up), multiply by 30 days, then by $1.25 per million. For Timestream: estimate your daily write volume in GB, multiply by $0.036 (hot) and $0.012 (cold) for writes, then add estimated query scans in GB × $0.01. Compare to your self-managed alternative.

The numbers usually tell the story more clearly than the feature comparisons.

The Honest Verdict

Both services are solid managed database options for their target use cases. The mistake isn’t choosing either one — it’s treating them as interchangeable or trying to force a square workload into a round service.

Keyspaces is expensive at scale compared to self-managed Cassandra, but the operational savings are real if you don’t have Cassandra expertise. Timestream is cost-effective for time-series analytics if you control your query patterns, and punishing if you don’t.

If you’re starting a new project today with no existing constraints: choose based on your actual access patterns, not the feature matrix. If you’re migrating from an existing system: measure your current cost and workload characteristics before committing.

The worst outcome is building your entire data architecture around the wrong service and discovering the problem at 3am when your billing alert fires.

For more on AWS databases, Aurora vs RDS covers managed SQL options, and cloud migration covers moving existing databases to cloud-native services. The FinOps guide covers cost modeling for AWS database services at scale.
