Amazon EKS Auto Mode in Production: What AWS Manages and What You Still Own
AWS announced Amazon EKS Auto Mode on December 1, 2024. The deeper “under the hood” explanation followed on March 31, 2025. On February 10, 2026, AWS added CloudWatch Vended Logs support for Auto Mode’s managed capabilities. By April 10, 2026, the interesting question is no longer whether Auto Mode is real. It is whether Auto Mode is the right production operating model for your cluster.
That distinction matters. Plenty of Kubernetes features look attractive during cluster creation and become annoying during incident response. Auto Mode is better than that, but it still has a clear opinion about how your platform should run. If that opinion matches your team, Auto Mode removes a lot of low-value work. If it does not, it removes control you still need.
What Auto Mode Actually Takes Over
The current EKS user guide is very explicit here. Auto Mode does not just provision worker nodes. AWS manages a larger slice of the data plane than standard EKS mode:
- compute autoscaling
- pod and service networking
- load balancing integration
- block storage drivers
- node lifecycle, patching, and replacement
AWS also documents some strong defaults that shape operations:
- immutable node AMIs
- SELinux-enabled nodes
- read-only root file systems
- no SSH or SSM access to Auto Mode nodes
- a maximum node lifetime of 21 days
That is a real operating-model shift. If you are coming from a standard EKS getting-started setup or an explicit Karpenter autoscaling stack, Auto Mode is not “Karpenter but easier.” It is AWS deciding that a production node should be treated more like an appliance than a pet.
When I Would Use It
I would seriously consider Auto Mode for:
- application teams that want Kubernetes without becoming node-management experts
- platform teams that are understaffed relative to cluster count
- environments with mostly standard Linux workloads
- greenfield EKS platforms where opinionated defaults are an advantage
I would be cautious if the cluster depends on:
- direct node access for debugging or custom host configuration
- unusual storage migration requirements
- highly customized load balancer behavior
- node-level software that assumes a mutable host
- platform teams that already get strong results from explicit Karpenter plus custom node classes
That last point matters. Auto Mode is not automatically better than running Karpenter explicitly on EKS. It is better when you want AWS to own more of the boring but fragile parts.
The Production Baseline I Would Set First
Before any real workload lands on Auto Mode, I want four things settled.
1. IAM and cluster permissions
If you enable Auto Mode on an existing cluster, the current docs say the cluster IAM role needs additional managed policies attached:
- AmazonEKSComputePolicy
- AmazonEKSBlockStoragePolicy
- AmazonEKSLoadBalancingPolicy
- AmazonEKSNetworkingPolicy
- AmazonEKSClusterPolicy
That is not optional paperwork. If you skip it, the platform looks broken when it is really under-permissioned.
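A minimal sketch of the attachment step, assuming a cluster IAM role named MyEksClusterRole (a hypothetical name — substitute your own). It prints each command instead of running it, so you can review before dropping the echo:

```shell
#!/usr/bin/env sh
# Sketch: attach the Auto Mode managed policies to an existing
# cluster IAM role. Prints the commands for review (dry run).
attach_auto_mode_policies() {
  CLUSTER_ROLE="MyEksClusterRole"  # hypothetical role name

  # The five AWS-managed policies the EKS user guide lists for
  # enabling Auto Mode on an existing cluster.
  for POLICY in \
      AmazonEKSComputePolicy \
      AmazonEKSBlockStoragePolicy \
      AmazonEKSLoadBalancingPolicy \
      AmazonEKSNetworkingPolicy \
      AmazonEKSClusterPolicy
  do
    # Drop the leading echo to run the attachment for real.
    echo aws iam attach-role-policy \
      --role-name "$CLUSTER_ROLE" \
      --policy-arn "arn:aws:iam::aws:policy/$POLICY"
  done
}

attach_auto_mode_policies
```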
2. Networking assumptions
Auto Mode manages pod networking, but it does not absolve you from VPC design. Subnet layout, route tables, CIDR planning, egress design, and private connectivity are still your problem. The EKS networking guide is still relevant because Auto Mode simplifies controller ownership, not network architecture.
3. A deliberate NodeClass
The default path is fine for experiments. Production deserves an explicit NodeClass so subnet selection, security groups, storage, public IP behavior, and logging defaults are visible in Git:
```yaml
apiVersion: eks.amazonaws.com/v1
kind: NodeClass
metadata:
  name: production-private
spec:
  role: AmazonEKSAutoNodeRole
  subnetSelectorTerms:
    - tags:
        Name: "private-subnet"
        kubernetes.io/role/internal-elb: "1"
  securityGroupSelectorTerms:
    - tags:
        Name: "eks-cluster-sg"
  networkPolicy: DefaultDeny
  networkPolicyEventLogs: Enabled
  ephemeralStorage:
    size: "120Gi"
    iops: 3000
    throughput: 125
  advancedNetworking:
    associatePublicIPAddress: false
  advancedSecurity:
    fips: false
  tags:
    Environment: "production"
    Team: "platform"
```
One production nuance from the docs is easy to miss: if you create a custom NodeClass, you also need an EKS access entry for the node IAM role using access-entry type EC2 and the AmazonEKSAutoNodePolicy. The built-in NodeClass path hides this from you. The custom path does not.
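That access-entry step can be sketched as two AWS CLI calls. The cluster name and node role ARN below are hypothetical placeholders, and the sketch prints the commands for review rather than executing them:

```shell
#!/usr/bin/env sh
# Sketch: grant a custom-NodeClass node role cluster access via
# an EC2-type access entry plus the AmazonEKSAutoNodePolicy.
print_access_entry_cmds() {
  CLUSTER="prod-auto"  # hypothetical cluster name
  NODE_ROLE_ARN="arn:aws:iam::111122223333:role/AmazonEKSAutoNodeRole"

  # Create the access entry for the node IAM role, type EC2.
  echo aws eks create-access-entry \
    --cluster-name "$CLUSTER" \
    --principal-arn "$NODE_ROLE_ARN" \
    --type EC2

  # Associate the Auto Mode node access policy with that entry.
  echo aws eks associate-access-policy \
    --cluster-name "$CLUSTER" \
    --principal-arn "$NODE_ROLE_ARN" \
    --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSAutoNodePolicy \
    --access-scope type=cluster
}

print_access_entry_cmds
```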
4. A deliberate NodePool
NodePool is where you encode compute policy, not just scheduling convenience:
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: apps-ondemand
spec:
  template:
    metadata:
      labels:
        workload-tier: app
    spec:
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: production-private
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: eks.amazonaws.com/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      expireAfter: 168h
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    budgets:
      - nodes: 10%
  limits:
    cpu: "500"
    memory: 1000Gi
```
The current NodePool docs are worth reading carefully. By default, Auto Mode consolidates underutilized instances, expires instances after 336 hours, and sets a disruption budget of 10% of nodes. If you do not set expectations for that behavior, the first “why did this node rotate?” question will surprise people who thought Auto Mode was only about scaling.
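If the rotation cadence is a concern, Karpenter-style disruption budgets can be scheduled so that voluntary disruption happens only in windows you choose. A hedged sketch (the schedule values and pool name are assumptions, not recommendations):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: apps-ondemand-windowed   # hypothetical name
spec:
  # template omitted; same shape as the NodePool above
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    budgets:
      # Block voluntary disruptions during business hours (UTC).
      - nodes: "0"
        schedule: "0 9 * * mon-fri"
        duration: 8h
      # Otherwise allow up to 10% of nodes to rotate at once.
      - nodes: "10%"
```

Note that budgets limit voluntary disruption (consolidation, expiry), not involuntary events like Spot interruptions or hardware failure.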
The Built-In NodePools Are Opinionated Too
Auto Mode includes built-in system and general-purpose NodePools. You cannot modify them, only enable or disable them. They are useful, but they are not neutral.
For example:
- system is for cluster-critical workloads and uses a CriticalAddonsOnly taint
- general-purpose handles regular workloads
- both built-ins use on-demand capacity only
That is fine for a first cluster. It is not enough for every production environment. If you plan to separate cost-sensitive workers, GPU jobs, zone-local pools, or ARM64 workloads, move quickly to explicit custom NodePools.
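As one sketch of that separation, assuming interruption-tolerant batch work on ARM64 Spot capacity (the pool name, label, and taint are hypothetical; the NodeClass is the one defined earlier):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: batch-spot-arm64   # hypothetical name
spec:
  template:
    metadata:
      labels:
        workload-tier: batch
    spec:
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: production-private
      # Taint the pool so only workloads that tolerate
      # interruption (and this taint) land here.
      taints:
        - key: workload-tier/batch
          effect: NoSchedule
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
  limits:
    cpu: "200"
```

Workloads opt in with a matching toleration and a nodeSelector on the workload-tier label.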
One more nuance from the docs: if you create a cluster without built-in NodePools, the default NodeClass is not created for you. That is good when you want full control, but it means the production design work moves to day zero rather than week three.
Observability Is Better Than It Was
This is where Auto Mode got more credible in 2026.
On February 10, 2026, AWS added CloudWatch Vended Logs support for Auto Mode managed components. The current docs split observability into two buckets:
- control plane logs
- managed component logs for compute autoscaling, EBS CSI, load balancing, and VPC CNI IPAM
That separation matters because enabling control plane logs does not automatically give you the component logs that explain Auto Mode behavior. If you want real troubleshooting coverage, configure both. Managed component logs can now be delivered to CloudWatch Logs, Amazon S3, or Amazon Kinesis Data Firehose.
That pairs naturally with CloudWatch Container Insights on EKS. Container Insights still matters for workload telemetry. Auto Mode managed component logs matter for platform telemetry. They are different layers and you want both.
Incident Response Works Differently
The biggest operational trade-off is node access.
AWS documents that you cannot directly access Auto Mode EC2 managed instances, including by SSH. That is a feature from a security and fleet-management perspective, but it changes how your team debugs production problems.
The supported paths now are:
- NodeDiagnostic resources
- kubectl-based debug containers
- EC2 get-console-output
- CloudWatch-delivered component logs
If your incident culture still depends on “SSH to the node and poke around,” Auto Mode will feel restrictive. If your team already prefers Kubernetes-native debugging and centralized logs, the shift is much easier.
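For the NodeDiagnostic path, a hedged sketch of what the resource looks like. The field names follow the current Auto Mode docs as I understand them, but verify against your cluster's CRD; the instance ID and presigned S3 URL are placeholders:

```yaml
apiVersion: eks.amazonaws.com/v1alpha1
kind: NodeDiagnostic
metadata:
  # Must match the name of the node you are diagnosing.
  name: i-01234567890abcdef
spec:
  logCapture:
    # Presigned S3 URL the node uploads its log bundle to (placeholder).
    destination: "https://example-bucket.s3.amazonaws.com/diag.tar.gz?X-Amz-Signature=PLACEHOLDER"
```

Applying this asks the managed node to collect and upload a log bundle, which replaces the "SSH in and read the logs" habit.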
Migration Traps You Should Assume Are Real
The migration docs are refreshingly blunt, and that is a good thing.
EBS volume migration is not seamless.
AWS explicitly says migrating volumes from the standard EBS CSI controller to the EKS Auto Mode EBS CSI controller is not supported as a lift-and-shift. The storage classes use different provisioners: ebs.csi.aws.com versus ebs.csi.eks.amazonaws.com. There is an AWS Labs migration tool, but the docs also warn that the migration requires deleting and recreating PVC and PV resources. Test that in non-production first.
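The provisioner split is visible directly in the StorageClass objects. A minimal sketch of the two sides (class names and parameters are illustrative; the provisioner strings are the ones the docs name):

```yaml
# Standard EKS: self-managed EBS CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-standard   # illustrative name
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
---
# EKS Auto Mode: AWS-managed EBS CSI controller, different provisioner.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-auto   # illustrative name
provisioner: ebs.csi.eks.amazonaws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
```

Because a PV is bound to its provisioner, moving a volume between these classes means recreating the PVC and PV, which is exactly why the docs refuse to call it a lift-and-shift.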
Load balancer migration is not seamless either.
AWS also says migrating load balancers from the AWS Load Balancer Controller to EKS Auto Mode is not supported as a direct migration path. If your cluster has a large ALB/NLB footprint, plan that move as a service migration, not a checkbox.
Disabling Auto Mode is destructive.
The current docs say turning Auto Mode off terminates Auto Mode EC2 instances and deletes Auto Mode-managed load balancers. It does not delete EBS volumes. That is not a reversible toggle you try casually in production.
Those are exactly the kinds of details that separate “managed” from “safe to migrate without planning.” Do the planning.
How I Would Roll It Out
I would not start with the busiest cluster. I would use this order:
- New non-critical environment with explicit NodeClass and NodePool manifests in Git.
- Enable control plane logs and Auto Mode managed component logs on day one.
- Migrate stateless services first.
- Keep delivery boring and deterministic through ArgoCD on EKS or your existing GitOps workflow.
- Move storage-heavy and ingress-heavy services only after you have proven your migration playbook.
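The control plane half of the logging step above can be sketched as one AWS CLI call. The cluster name is a placeholder, and the command is printed for review rather than executed; managed component log delivery is configured separately:

```shell
#!/usr/bin/env sh
# Sketch: enable all five control plane log types on a cluster.
# Prints the command for review (dry run).
print_logging_cmd() {
  CLUSTER="staging-auto"  # hypothetical cluster name

  echo aws eks update-cluster-config \
    --name "$CLUSTER" \
    --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'
}

print_logging_cmd
```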
If your application packaging is still inconsistent, clean that up before the platform move. Packaging workloads as Helm charts on EKS is still the right discipline for repeatable deployment even when AWS manages more of the cluster internals.
Final Take
Amazon EKS Auto Mode is good when you want Kubernetes as a productized platform, not as a collection of node-level tuning opportunities. It reduces a lot of fragile work. It also assumes you are willing to give AWS more control over compute, storage, networking integration, and node operations.
That is a strong trade, not a free one. Teams with limited platform bandwidth should take it seriously. Teams that depend on direct node access, custom migrations, or deep data-plane control should evaluate it with clear eyes before calling it a universal upgrade.