Horizontal vs Vertical Scaling in AWS
There are two ways to scale in AWS: vertical and horizontal. I want to start with vertical scaling because it’s the approach most of us learned first. Then we’ll get into horizontal scaling, which is what you probably want in production.
Studying for an AWS Certification? I have three questions at the end that come up on the exam.
Vertical Scaling
If you’ve worked in traditional IT, vertical scaling is what you already know. You keep your infrastructure as-is. You just buy a bigger machine.
Your code stays the same. You deploy the same application on hardware with better specs – faster CPU, more RAM, bigger disks. That’s the whole idea. Data lives on a single node, and you handle growth by upgrading that node.

The problem? There’s always a ceiling. You can only stuff so much hardware into one box. Once you max out, you’re done. It’s also a single point of failure. That node goes down? Everything goes down with it. And resizing an EC2 instance usually means stopping it first, so there’s a small outage baked in.
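To make the "small outage" concrete, here is a minimal sketch of the stop, resize, start sequence. It assumes an EBS-backed instance and an EC2 client created with `boto3.client("ec2")`; the helper name `resize_instance` and any instance ID you pass in are hypothetical, and a real runbook would add error handling.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Vertically scale one EBS-backed instance (hypothetical helper).

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped,
    # so the outage starts here.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Swap the instance type, for example "t3.medium" -> "t3.large".
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # The outage ends once the instance is running again.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

Everything between the first and last call is downtime for whatever that single node was serving.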
I’m not saying vertical scaling is useless. Databases, for instance, often run better on one big instance than spread across several smaller ones. Just size that instance for the job: a t3.micro (or the older t2.micro) has no business running a production database. Some workloads just need to scale up, and that’s fine.
What I want you to think about, though, is that for most cloud architectures, vertical scaling isn’t the goal. Horizontal scaling is where things get interesting.
Horizontal Scaling
Horizontal scaling is different. Instead of upgrading one machine, you add more machines. Same specs. Just more of them. A load balancer sits in front and spreads the traffic around.
The big advantage: no hard ceiling. Traffic doubles? Add instances. Traffic triples? Add more. One instance crashes? The others keep serving requests. You get fault tolerance almost for free.
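A toy model shows why a crashed instance doesn't take the service down. This is only a sketch of round-robin routing with a health flag, not how an AWS load balancer is implemented; the class name `RoundRobinBalancer` and the instance names are made up for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: rotate through instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.healthy = {name: True for name in instances}
        self._ring = cycle(instances)

    def mark_unhealthy(self, name):
        self.healthy[name] = False

    def route(self):
        # Look at each instance at most once per request.
        for _ in range(len(self.healthy)):
            candidate = next(self._ring)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy instances left")


lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])
first_round = [lb.route() for _ in range(3)]
lb.mark_unhealthy("i-b")  # simulate a crash
after_crash = [lb.route() for _ in range(4)]
```

After the simulated crash, requests keep flowing to `i-a` and `i-c`; only when every instance is unhealthy does routing fail.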
But it’s not all simple. Your application has to deal with the fact that any given request might land on any instance. State management becomes your problem. Sessions, caches, data consistency – you have to think about all of it. This is why stateless applications pair so well with horizontal scaling.
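Here is a small sketch of the session problem. It compares instances that keep sessions in their own memory with instances that share an external store. The plain dict stands in for something like a cache or database service, and `AppInstance`, `simulate`, and the `user-42` session ID are hypothetical names.

```python
class AppInstance:
    """One app server; `session_store` is wherever it keeps session data."""

    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store

    def add_to_cart(self, session_id, item):
        cart = self.sessions.setdefault(session_id, [])
        cart.append(item)
        return list(cart)


def simulate(shared_store: bool):
    external_store = {}  # stand-in for an external session store
    instances = [
        # Stateless: every instance uses the same external store.
        # Stateful: every instance gets its own private dict.
        AppInstance(f"app-{i}", external_store if shared_store else {})
        for i in range(3)
    ]
    cart = []
    for i, item in enumerate(["book", "pen", "lamp"]):
        # Each request from the same user lands on a different instance.
        cart = instances[i % len(instances)].add_to_cart("user-42", item)
    return cart


stateful_cart = simulate(shared_store=False)
stateless_cart = simulate(shared_store=True)
```

With per-instance memory, the last instance only knows about the last item. With the shared store, any instance can serve any request and the cart stays complete.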
Now, three questions worth memorizing. They’ll help you understand Auto Scaling on AWS, and they will show up on the exam.
What are we scaling? Is this an EC2 instance? A database? How do new instances get created, and where does the template come from?
Where are we scaling? Which VPC? How many availability zones? Which load balancer do these instances attach to? Getting the network topology right matters a lot.
When do we scale? We need metrics and thresholds. If we decide to run a fleet of EC2 instances across two AZs, we need CloudWatch watching CPU or request count, firing alarms when things get hot – and scaling back in when traffic drops.
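The "when" question boils down to a decision rule. Below is a minimal sketch of threshold-based scaling, assuming illustrative thresholds of 70% and 30% average CPU and a fleet bounded between 2 and 6 instances. On AWS this logic lives in CloudWatch alarms and scaling policies rather than in your own code; `ScalingRule` and `desired_capacity` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    min_size: int = 2              # never drop below this many instances
    max_size: int = 6              # never grow beyond this many instances
    scale_out_above: float = 70.0  # average CPU % that triggers scale-out
    scale_in_below: float = 30.0   # average CPU % that triggers scale-in


def desired_capacity(current: int, avg_cpu: float, rule: ScalingRule) -> int:
    """Return the instance count the fleet should move to."""
    if avg_cpu > rule.scale_out_above:
        target = current + 1
    elif avg_cpu < rule.scale_in_below:
        target = current - 1
    else:
        target = current
    # Clamp to the configured bounds.
    return max(rule.min_size, min(rule.max_size, target))


rule = ScalingRule()
hot = desired_capacity(current=2, avg_cpu=85.0, rule=rule)    # scale out
quiet = desired_capacity(current=3, avg_cpu=20.0, rule=rule)  # scale in
```

The gap between the two thresholds is deliberate: it keeps the fleet from flapping when the metric hovers around a single value.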
In the AWS ecosystem, CloudWatch metrics and alarms drive most scaling decisions. EC2 Auto Scaling groups, database read replicas, even Lambda provisioned concurrency – CloudWatch supplies the signal. On the certification exam, when you see a scaling question, reach for CloudWatch first.
Horizontal Scaling vs Vertical Scaling in AWS
Here are the AWS services worth learning if you want to build horizontally scalable systems:
1 - Launch Templates – this is what you should use today to define instance configuration. Launch Configurations are legacy: AWS recommends Launch Templates instead and no longer adds support for new instance types to Launch Configurations, so don’t use those anymore.
2 - Auto Scaling Groups – these manage your fleet. They maintain desired capacity and swap out unhealthy instances on their own.
3 - Load Balancer (Application Load Balancer or Network Load Balancer) – spreads traffic across your instances. Skip Classic Load Balancer; it’s the previous generation and AWS recommends migrating off it. Use ALB for HTTP/HTTPS or NLB for TCP/UDP.
4 - Target Groups – these sit behind your load balancer. Each group holds the registered instances and the health check settings, and the load balancer only routes traffic to the healthy ones.
5 - API Gateway – handy if you’re scaling serverless or container-based backends.
6 - EKS (running multiple workloads) – for containerized apps where you need to scale pods and nodes separately.
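To see how the first four services fit together, here is a hedged sketch that builds the request parameters for an Auto Scaling group and a target tracking policy. The parameter names follow the boto3 `autoscaling` client's `create_auto_scaling_group` and `put_scaling_policy` calls. The helper functions, resource names, subnet IDs, and the ARN are all placeholders I made up, and values such as the 120-second grace period are assumptions to tune for your workload.

```python
def build_asg_request(name, launch_template, subnet_ids, target_group_arn,
                      min_size=2, max_size=6):
    """Parameters for create_auto_scaling_group (hypothetical helper)."""
    return {
        "AutoScalingGroupName": name,
        # What are we scaling? Instances defined by a launch template.
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": min_size,
        # Where are we scaling? One subnet per availability zone.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "TargetGroupARNs": [target_group_arn],
        # Replace instances that fail the load balancer health check.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 120,
    }


def build_cpu_policy_request(asg_name, target_cpu=50.0):
    """Parameters for put_scaling_policy: when do we scale?"""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        },
    }


asg_request = build_asg_request(
    name="web-asg",                       # placeholder name
    launch_template="web-template",       # placeholder name
    subnet_ids=["subnet-aaa", "subnet-bbb"],  # placeholder subnet IDs
    target_group_arn="arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/web/ID",  # placeholder
)
policy_request = build_cpu_policy_request("web-asg")
```

With real resource names you could pass these to `boto3.client("autoscaling").create_auto_scaling_group(**asg_request)` and `put_scaling_policy(**policy_request)`. Building the dictionaries separately keeps the what, where, and when answers visible in one place.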