Container Security on EKS: Trivy, ECR Scanning, and Runtime Protection

Written by Bits Lovers

I’ve watched teams build gorgeous Kubernetes clusters on EKS, then basically shrug at container security. The cluster runs great, pipelines are solid, autoscaling hums along – and then someone actually runs a vulnerability scan and finds 847 CVEs sitting in the running workloads, 23 of them critical. That’s not a made-up number, either. I pulled it straight from a real cluster audit last quarter.

Container security on EKS isn’t something you can solve with a single tool or a quick checklist. It’s a discipline you have to apply across every layer – from the base image you pick all the way through to what your workloads actually do at runtime. In this post, we’re going to cover the whole picture: image scanning with Trivy, registry-level scanning with Amazon ECR, admission control to keep bad images out, runtime protection with Falco, network policies, secrets management, supply chain integrity, and the compliance frameworks that hold it all together.

If you’re still spinning up your first EKS cluster, head over to the EKS getting started guide first and circle back once things are up and running. Hardening security on a cluster that isn’t even operational yet is putting the cart before the horse.

The Container Security Landscape in 2026

The threat model for containers has shifted pretty dramatically. Three years ago, most of us were mainly worried about base image vulnerabilities. Today the attack surface is way wider and the stakes keep climbing:

  • Software supply chain attacks keep climbing year after year, with attackers going straight after build pipelines, dependency feeds, and container registries.
  • Runtime exploitation of application-level vulnerabilities (think Log4Shell-class events) means a bug in a dependency you didn’t even realize you had can take down the whole workload.
  • Regulatory requirements now demand Software Bill of Materials (SBOM) generation and vulnerability disclosure timelines. Between the EU Cyber Resilience Act and US Executive Order 14028, this is no longer optional for a lot of organizations.
  • Cryptomining and resource abuse are still among the most common things attackers do once they get inside a container.

On the bright side, the tooling has come a long way. Open source projects like Trivy and Falco are production-grade now, AWS has beefed up ECR scanning with Amazon Inspector integration, and the Kubernetes community has settled on some clear patterns for admission control and policy enforcement.

The Six Layers of Container Security

Container security is a layered discipline. No single layer gets the job done by itself. Here’s how those layers break down on EKS:

| Layer | Scope | Primary Tools | What It Catches |
|---|---|---|---|
| Image | Build time | Trivy, Grype, Snyk | Known CVEs, secrets, misconfigs in images |
| Registry | Push time | ECR Enhanced Scanning, Amazon Inspector | Vulnerabilities before deployment |
| Orchestrator | Deploy time | OPA Gatekeeper, Kyverno, Pod Security Standards | Policy violations, privileged pods, bad configs |
| Runtime | Running workloads | Falco, GuardDuty Runtime Monitoring | Anomalous behavior, shell spawns, unexpected network calls |
| Network | Pod-to-pod traffic | Network Policies, Calico, Cilium | Lateral movement, unauthorized communication |
| Host | Node-level | Bottlerocket, CIS Benchmarks, kube-bench | Kernel exploits, node-level privilege escalation |

Each layer has its own tooling, its own failure modes, and its own way of responding when things go wrong. You really need all six working in concert. Let’s walk through each one in detail, starting with the foundation: image scanning.

Layer 1: Image Scanning with Trivy

Trivy (built by Aqua Security) has become the go-to open source vulnerability scanner across the Kubernetes ecosystem – and for good reason. It covers a lot of ground: OS package vulnerabilities, language-specific dependency issues (Python, Node.js, Java, Go, Rust, you name it), hardcoded secrets, infrastructure-as-code misconfigurations, and even license compliance.

Installing Trivy

On macOS:

brew install trivy

On Linux:

sudo apt-get install wget apt-transport-https gnupg lsb-release
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo gpg --dearmor -o /usr/share/keyrings/trivy.gpg
echo "deb [signed-by=/usr/share/keyrings/trivy.gpg] https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/trivy.list
sudo apt-get update
sudo apt-get install trivy

Scanning a Container Image

The most common operation. Scan an image before pushing it to ECR:

trivy image --severity HIGH,CRITICAL --format table your-ecr-repo.amazonaws.com/my-app:v1.2.3

Output looks like this:

your-ecr-repo.amazonaws.com/my-app:v1.2.3 (debian 12.5)
========================================================
Total: 42 (HIGH: 31, CRITICAL: 11)

┌──────────────────┬────────────────┬──────────┬───────────────────┐
│     Library      │ Vulnerability  │ Severity │ Installed Version │
├──────────────────┼────────────────┼──────────┼───────────────────┤
│ libcurl4         │ CVE-2026-12345 │ CRITICAL │ 7.88.1-10         │
│ openssl          │ CVE-2026-67890 │ HIGH     │ 3.0.11-1          │
│ libxml2          │ CVE-2026-11111 │ HIGH     │ 2.9.14+dfsg-1.3   │
└──────────────────┴────────────────┴──────────┴───────────────────┘

For CI/CD integration, write machine-readable output (JSON here) to a file and set an exit code:

trivy image \
  --severity HIGH,CRITICAL \
  --exit-code 1 \
  --format json \
  --output trivy-results.json \
  your-ecr-repo.amazonaws.com/my-app:v1.2.3

The --exit-code 1 flag is what makes this useful in CI – it returns a non-zero exit code whenever HIGH or CRITICAL vulnerabilities show up, which is how you fail a pipeline build when things cross your threshold.
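If a plain severity threshold is too blunt, the JSON report can be post-processed before deciding whether to fail the build. Here's a minimal sketch, assuming the report layout Trivy writes with --format json (a top-level Results list whose entries carry a Vulnerabilities list). The function names, the sample data, and the "only block on findings that already have a fix" policy are illustrative assumptions, not part of Trivy:

```python
from collections import Counter

def count_by_severity(report: dict) -> Counter:
    """Tally findings per severity across every scan target in the report."""
    counts = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" can be missing or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts

def blocking_findings(report: dict, severities=("HIGH", "CRITICAL")) -> list:
    """IDs of findings that are severe AND already have a fixed version."""
    blocked = []
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities and vuln.get("FixedVersion"):
                blocked.append(vuln["VulnerabilityID"])
    return blocked

# Made-up sample in the assumed report shape (the CVE IDs are placeholders).
sample = {"Results": [
    {"Target": "my-app (debian 12.5)", "Vulnerabilities": [
        {"VulnerabilityID": "CVE-0000-0001", "Severity": "CRITICAL", "FixedVersion": "1.2.4"},
        {"VulnerabilityID": "CVE-0000-0002", "Severity": "HIGH", "FixedVersion": ""},
    ]},
    {"Target": "app/requirements.txt", "Vulnerabilities": None},
]}

print(dict(count_by_severity(sample)))
print(blocking_findings(sample))
```

In a pipeline you would load trivy-results.json with json.load and exit non-zero when blocking_findings returns anything.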

Scanning a Kubernetes Cluster

Trivy can scan all workloads running in a cluster, not just individual images:

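# Scan the entire cluster (uses the current kubeconfig context)
trivy k8s --report summary

# Scan a specific namespace
trivy k8s --include-namespaces production --report all

# Generate a CycloneDX bill of materials (Trivy's format name is "cyclonedx")
trivy k8s --include-namespaces production --format cyclonedx --output production-sbom.json

# Note: the k8s subcommand has changed between Trivy releases. Older versions
# take a target argument and --namespace instead, so check `trivy k8s --help`.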

Scanning for Secrets and Misconfigurations

Trivy isn’t just for CVEs, by the way. It’ll also catch hardcoded credentials, API keys, and private keys that someone accidentally baked into an image:

# Scan for secrets
trivy image --scanners secret your-ecr-repo.amazonaws.com/my-app:v1.2.3

# Scan Infrastructure as Code files
trivy config ./terraform/

# Scan a Dockerfile for misconfigurations
trivy config Dockerfile

Generating an SBOM

These days, generating a Software Bill of Materials is something more and more organizations are being required to do for compliance:

# Generate SPDX format SBOM
trivy image --format spdx-json --output sbom.json your-ecr-repo.amazonaws.com/my-app:v1.2.3

# Generate CycloneDX format SBOM
trivy image --format cyclonedx --output sbom.cdx.json your-ecr-repo.amazonaws.com/my-app:v1.2.3
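An SBOM earns its keep the day a Log4Shell-class CVE drops and you need to know which images ship the affected library. The sketch below reads a CycloneDX JSON document and looks up a component by name. It assumes the standard CycloneDX layout (a bomFormat field and a components list with name and version); the helper names and the sample document are made up for illustration:

```python
def list_components(bom: dict) -> list:
    """Return sorted, de-duplicated (name, version) pairs from a CycloneDX document."""
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("expected a CycloneDX document")
    pairs = {(c.get("name", ""), c.get("version", ""))
             for c in bom.get("components") or []}
    return sorted(pairs)

def versions_of(bom: dict, name: str) -> list:
    """All versions of the named component found in the SBOM."""
    return [version for comp, version in list_components(bom) if comp == name]

# Illustrative SBOM fragment; a real one comes from json.load on the Trivy output.
sample_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "openssl", "version": "3.0.11-1"},
        {"type": "library", "name": "log4j-core", "version": "2.14.1"},
    ],
}

print(versions_of(sample_bom, "log4j-core"))
```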

Trivy in Server Mode

For high-volume environments, run Trivy as a server to avoid downloading the vulnerability database on every scan:

# Start the Trivy server
trivy server --listen 0.0.0.0:4954

# Scan using the server
trivy image --server http://trivy-server:4954 your-ecr-repo.amazonaws.com/my-app:v1.2.3

Layer 2: Amazon ECR Security

Amazon ECR gives you two scanning modes: basic scanning (historically built on the open source Clair engine, with AWS now also offering its own native engine for basic scans) and enhanced scanning (powered by Amazon Inspector). For production workloads, enhanced scanning is the way to go.

Enabling Enhanced Scanning

Enable enhanced scanning on an ECR repository using Terraform:

resource "aws_ecr_repository" "app" {
  name                 = "my-app"
  image_tag_mutability = "IMMUTABLE"

  image_scanning_configuration {
    scan_on_push = true
  }
}

resource "aws_ecr_registry_scanning_configuration" "enhanced" {
  scan_type = "ENHANCED"

  rule {
    scan_frequency = "SCAN_ON_PUSH"
    repository_filter {
      filter      = "*"
      filter_type = "WILDCARD"
    }
  }

  rule {
    scan_frequency = "CONTINUOUS_SCAN"
    repository_filter {
      filter      = "production-*"
      filter_type = "WILDCARD"
    }
  }
}

Pay attention to the CONTINUOUS_SCAN frequency for production repositories here. What this does is have Amazon Inspector keep re-scanning your images as new vulnerabilities get published – even if the image itself hasn’t changed at all. That matters more than you’d think, because a perfectly clean image pushed on Monday can suddenly become vulnerable on Wednesday when a new CVE drops.

ECR Scan Types Comparison

| Feature | Basic Scanning | Enhanced Scanning |
|---|---|---|
| Engine | Open source (Clair) | Amazon Inspector |
| OS packages | Yes | Yes |
| Application packages | Limited | Full (Python, Node.js, Java, Go, .NET, Rust) |
| Continuous re-scan | No | Yes |
| Lambda function scanning | No | Yes |
| Export to Security Hub | No | Yes |
| Cost | Free | Per-image pricing |
| Scan on push | Yes | Yes |

Querying Scan Results

After a scan completes, query the results programmatically:

# Get scan findings for a specific image
aws ecr describe-image-scan-findings \
  --repository-name my-app \
  --image-id imageTag=v1.2.3 \
  --region us-east-1

# List all images with critical findings using AWS CLI
aws ecr describe-images \
  --repository-name my-app \
  --query 'imageDetails[?imageScanFindingsSummary.findingSeverityCounts.CRITICAL > `0`]' \
  --region us-east-1
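The describe-image-scan-findings response is easy to turn into a deploy gate. This is a rough sketch that works on the parsed JSON; it assumes the response carries imageScanStatus.status and imageScanFindings.findingSeverityCounts, and that COMPLETE (basic scanning) and ACTIVE (enhanced scanning) are the "results are usable" states. The function names and the sample response are illustrative:

```python
def critical_count(response: dict) -> int:
    """Number of CRITICAL findings reported for the image (0 if none listed)."""
    findings = response.get("imageScanFindings") or {}
    return (findings.get("findingSeverityCounts") or {}).get("CRITICAL", 0)

def is_deployable(response: dict, max_critical: int = 0) -> bool:
    """Allow a deploy only when a finished scan stays within the critical budget."""
    status = (response.get("imageScanStatus") or {}).get("status")
    if status not in ("COMPLETE", "ACTIVE"):
        return False  # scan pending, failed, or unsupported: stay conservative
    return critical_count(response) <= max_critical

# Illustrative response fragment, not real scan output.
sample_response = {
    "imageScanStatus": {"status": "COMPLETE"},
    "imageScanFindings": {"findingSeverityCounts": {"CRITICAL": 2, "HIGH": 5}},
}

print(critical_count(sample_response), is_deployable(sample_response))
```

Treating an unfinished scan as "not deployable" is a deliberate choice here; a pipeline that races the scanner should wait and retry rather than wave the image through.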

ECR Lifecycle Policies for Vulnerable Images

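Lifecycle policies expire images by age, count, and tag status rather than by scan result, but pruning old images keeps stale, unpatched builds from lingering in the registry: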

aws ecr put-lifecycle-policy \
  --repository-name my-app \
  --lifecycle-policy-text '{
    "rules": [
      {
        "rulePriority": 1,
        "description": "Remove untagged images older than 14 days",
        "selection": {
          "tagStatus": "untagged",
          "countType": "sinceImagePushed",
          "countUnit": "days",
          "countNumber": 14
        },
        "action": { "type": "expire" }
      },
      {
        "rulePriority": 2,
        "description": "Keep only last 10 tagged images",
        "selection": {
          "tagStatus": "tagged",
          "tagPrefixList": ["v"],
          "countType": "imageCountMoreThan",
          "countNumber": 10
        },
        "action": { "type": "expire" }
      }
    ]
  }'
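To sanity-check what a policy like this would delete before applying it, you can model the two rules locally. The sketch below mirrors their intent (untagged images older than 14 days, and everything beyond the 10 newest images whose tag starts with "v"); it is a simplified model, not ECR's evaluation engine, and the image records are invented for the example:

```python
from datetime import datetime, timedelta, timezone

def expired_images(images, now, untagged_days=14, keep_tagged=10, prefix="v"):
    """images: dicts with 'digest', 'tags' (list of str), 'pushed_at' (datetime)."""
    expired = set()
    # Rule 1: untagged images pushed more than `untagged_days` ago.
    cutoff = now - timedelta(days=untagged_days)
    for img in images:
        if not img["tags"] and img["pushed_at"] < cutoff:
            expired.add(img["digest"])
    # Rule 2: among images tagged with the prefix, keep only the newest N.
    tagged = [i for i in images if any(t.startswith(prefix) for t in i["tags"])]
    tagged.sort(key=lambda i: i["pushed_at"], reverse=True)
    for img in tagged[keep_tagged:]:
        expired.add(img["digest"])
    return expired

now = datetime(2026, 1, 31, tzinfo=timezone.utc)
# Twelve tagged releases, one pushed per day, plus two untagged leftovers.
images = [{"digest": f"sha256:t{n}", "tags": [f"v1.0.{n}"],
           "pushed_at": now - timedelta(days=n)} for n in range(12)]
images.append({"digest": "sha256:old-untagged", "tags": [],
               "pushed_at": now - timedelta(days=20)})
images.append({"digest": "sha256:new-untagged", "tags": [],
               "pushed_at": now - timedelta(days=3)})

print(sorted(expired_images(images, now)))
```

Notice that neither rule looks at vulnerability findings: a fresh image with critical CVEs survives, and an old clean one gets expired.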

For more on ECR configuration including pull-through caches and referrer discovery, see the Amazon ECR pull-through cache and referrer discovery guide.

Enforcing Image Tag Immutability

Image tag immutability is one of those quiet settings that punches above its weight – it stops anyone with push access from overwriting a “clean” image tag with something malicious:

resource "aws_ecr_repository" "app" {
  name                 = "my-app"
  image_tag_mutability = "IMMUTABLE"
}

It’s a simple toggle with outsized impact on supply chain integrity. Once v1.2.3 gets pushed, that tag is locked – nobody can sneak a different image behind it. Need to ship an update? Push a new tag.

Layer 3: Admission Control

Admission controllers sit between the Kubernetes API server and whatever’s trying to create or update resources. Think of this layer as your last line of defense before a workload actually lands on the cluster. Two tools dominate this space: OPA Gatekeeper and Kyverno.

Pod Security Standards

Before you reach for a full-blown policy engine, though, enable Kubernetes’ built-in Pod Security Standards. This gives you baseline security enforcement right at the namespace level:

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

The three profiles available:

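| Profile | What It Blocks |
|---|---|
| Privileged | Nothing (unrestricted) |
| Baseline | Host namespace sharing, host ports, privileged containers, host path mounts |
| Restricted | All baseline restrictions plus: must run as non-root, drop all capabilities, no privilege escalation, a seccomp profile (RuntimeDefault or Localhost), and a restricted set of volume types. A read-only root filesystem is good practice but is not part of this profile. |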

For most production clusters, we’d suggest starting with baseline enforcement and restricted audit/warn, then gradually migrating workloads to full restricted enforcement as you go.
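Before flipping a namespace to restricted enforcement, it helps to know which workloads would be rejected. The sketch below checks a pod spec (as a plain dict) against a subset of the restricted profile: privileged mode, privilege escalation, non-root, dropped capabilities, and seccomp. It only looks at regular containers and is an approximation for triage, not a replacement for the built-in admission controller; the function name and sample specs are made up:

```python
def restricted_violations(pod_spec: dict) -> list:
    """Return human-readable violations for a subset of the restricted profile."""
    problems = []
    pod_sc = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers", []):
        name = container.get("name", "?")
        sc = container.get("securityContext") or {}
        if sc.get("privileged"):
            problems.append(f"{name}: privileged container")
        if sc.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation must be false")
        # Container-level setting wins; otherwise fall back to the pod level.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            problems.append(f"{name}: runAsNonRoot must be true")
        if "ALL" not in (sc.get("capabilities") or {}).get("drop", []):
            problems.append(f"{name}: must drop ALL capabilities")
        seccomp = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            problems.append(f"{name}: seccompProfile must be RuntimeDefault or Localhost")
    return problems

hardened = {
    "securityContext": {"runAsNonRoot": True,
                        "seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [{"name": "app", "securityContext": {
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"]}}}],
}
bare = {"containers": [{"name": "app"}]}

print(restricted_violations(hardened))
print(restricted_violations(bare))
```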

Kyverno Policies for Container Security

Kyverno is a policy engine built specifically for Kubernetes, and one of the things I like about it is that policies are just regular Kubernetes resources – no new language to learn. Here are the policies I’d consider essential:

Block images from unapproved registries:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: validate-registry
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must come from the approved ECR registry."
        pattern:
          spec:
            # =(...) applies the check only when the field is present
            =(initContainers):
              - image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"
            containers:
              - image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"

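It can be handy to run the same allowlist logic in CI, so a manifest gets flagged before it ever reaches the admission webhook. Here's a rough equivalent using shell-style wildcards; it is not Kyverno's matching engine, just a pre-check under the assumption that a single trailing wildcard is all the pattern needs. The helper name and the sample pod are hypothetical:

```python
from fnmatch import fnmatchcase

APPROVED = "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"

def unapproved_images(pod_spec: dict, pattern: str = APPROVED) -> list:
    """Images in the pod spec that do not match the approved registry pattern."""
    images = []
    # Init and ephemeral containers pull images too, so include them.
    for field in ("initContainers", "containers", "ephemeralContainers"):
        images += [c["image"] for c in pod_spec.get(field, [])]
    return [image for image in images if not fnmatchcase(image, pattern)]

pod = {
    "initContainers": [{"name": "init", "image": "docker.io/library/busybox:1.36"}],
    "containers": [{"name": "app",
                    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3"}],
}

print(unapproved_images(pod))
```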
Block privileged containers:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: block-privileged-containers
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: check-privileged
      match:
        any:
          - resources:
              kinds:
                - Pod
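      validate:
        message: "Privileged containers are not allowed in production."
        pattern:
          spec:
            containers:
              # =(...) lets containers that omit securityContext or privileged
              # pass, since privileged defaults to false when unset.
              - =(securityContext):
                  =(privileged): "false"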

Require resource limits:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: check-resource-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "CPU and memory resource limits are required."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"

OPA Gatekeeper Example

OPA Gatekeeper takes a different approach, using Rego for policy definitions. You get more expressive power that way, but there’s a steeper learning curve to deal with:

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sblocklatesttag
spec:
  crd:
    spec:
      names:
        kind: K8sBlockLatestTag
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sblocklatesttag
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          endswith(container.image, ":latest")
          msg := sprintf("Container <%v> uses the :latest tag which is not allowed.", [container.name])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockLatestTag
metadata:
  name: block-latest-tag
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces:
      - "production"
      - "staging"
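One gap worth knowing about: an endswith check on ":latest" does not catch images with no tag at all, and container runtimes resolve an untagged reference to latest. The sketch below shows the parsing involved, including registries with a port number and digest-pinned references. The function names are illustrative, and this is plain Python rather than Rego:

```python
def image_tag(image: str):
    """Return the effective tag of an image reference, or None if digest-pinned."""
    if "@" in image:
        return None  # e.g. repo/app@sha256:... is pinned by digest
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference belongs to a registry port such as localhost:5000.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return "latest"  # no tag given, so the runtime defaults to latest

def resolves_to_latest(image: str) -> bool:
    return image_tag(image) == "latest"

for ref in ("nginx", "nginx:latest", "localhost:5000/nginx",
            "nginx:1.25", "repo/app@sha256:abc123"):
    print(ref, resolves_to_latest(ref))
```

If you want the Gatekeeper policy to be airtight, the Rego would need a similar "no tag present" branch alongside the endswith check.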

Admission Control Tool Comparison

| Feature | Kyverno | OPA Gatekeeper |
|---|---|---|
| Policy language | Native YAML | Rego |
| Learning curve | Low | Medium to high |
| Mutation support | Yes | Limited |
| Image verification | Built-in (Cosign, Notary) | Requires external integration |
| Generate resources | Yes | No |
| Policy library | Extensive community library | Limited |
| Performance | Lower latency per request | Higher latency for complex Rego |
| Best for | Most EKS deployments | Complex policy logic |

For most teams running EKS, Kyverno is the better starting point. The native YAML policy format means your platform team can write and maintain policies without having to learn Rego from scratch. Reach for OPA Gatekeeper when you genuinely need the full expressiveness of a general-purpose policy language.

Layer 4: Runtime Security with Falco

Image scanning and admission control do a solid job of protecting you at build and deploy time. But what happens when a vulnerability nobody knew about gets exploited inside a running container? That’s where runtime security comes in. Falco, now a CNCF graduated project, has become the standard tool for exactly this scenario.

How Falco Works

Falco runs as a DaemonSet on every node in your EKS cluster. It hooks into system calls at the kernel level (using either a kernel module or an eBPF probe) and checks each syscall against a set of rules. Whenever something matches, Falco fires off an alert.

Under the hood, it works like this:

  1. A syscall happens inside a container (say, a shell process spawns)
  2. The Falco driver captures the syscall event
  3. The Falco userspace program checks the event against its loaded rules
  4. If something matches, Falco outputs an alert to stdout, a file, or whatever output channel you’ve configured
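The four steps above boil down to "stream of events in, rule predicates applied, alerts out". Here's a toy model of that loop. It is not how Falco is implemented (Falco evaluates its own condition language against kernel events); the event fields, the rule, and the function names are simplified stand-ins chosen for this sketch:

```python
SHELLS = {"bash", "sh", "zsh"}

# Each rule pairs metadata with a predicate over a single event.
RULES = [
    {
        "name": "Terminal Shell in Container",
        "priority": "WARNING",
        "condition": lambda e: (
            e["type"] == "execve"            # a new process was spawned
            and e["container_id"] != "host"  # it happened inside a container
            and e["proc_name"] in SHELLS     # the process is a shell
            and e["tty"] != 0                # a terminal is attached
        ),
    },
]

def evaluate(event: dict, rules=RULES) -> list:
    """Return one alert per rule whose condition matches the event."""
    return [
        {"rule": r["name"], "priority": r["priority"],
         "container_id": event["container_id"]}
        for r in rules if r["condition"](event)
    ]

shell_event = {"type": "execve", "container_id": "abc123",
               "proc_name": "bash", "tty": 34816}
print(evaluate(shell_event))
```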

Installing Falco on EKS

Install Falco using the Helm chart:

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set driver.kind=ebpf \
  --set falco.json_output=true \
  --set falco.http_output.enabled=true \
  --set falco.http_output.url="http://falcosidekick:2801/"

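That driver.kind=ebpf setting matters quite a bit on EKS. eBPF is the preferred probe type because it doesn’t require compiling a kernel module – something that can be a real headache on managed node groups running Bottlerocket AMIs. If your node kernels are recent enough, the chart also accepts driver.kind=modern_ebpf, which uses a CO-RE probe bundled into Falco itself, so there’s nothing to build or download per kernel.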

Also deploy Falcosidekick to handle alert routing:

helm install falcosidekick falcosecurity/falcosidekick \
  --namespace falco \
  --set config.slack.webhookurl="https://hooks.slack.com/services/..." \
  --set config.aws.region="us-east-1" \
  --set config.aws.cloudwatchlogs.loggroup="/falco/alerts"

Essential Falco Rules for EKS

Falco comes with a solid default rule set out of the box. But for EKS workloads specifically, these are the rules we’ve found matter most:

Detect a shell spawned inside a container:

- rule: Terminal Shell in Container
  desc: A shell was spawned in a container with an attached terminal
  condition: >
    spawned_process and container and shell_procs and proc.tty != 0
    and container_entrypoint
  output: >
    A shell was spawned in a container with an attached terminal
    (user=%user.name user_uid=%user.uid container_id=%container.id
    image=%container.image.repository process=%proc.name
    parent=%proc.pname command=%proc.cmdline terminal=%proc.tty)
  priority: WARNING
  tags: [container, shell, mitre_execution]

Detect unexpected outbound network connections:

- rule: Unexpected Outbound Connection
  desc: Detect outbound connections to unexpected ports from containers
  condition: >
    evt.type=connect and container and
    not fd.sport in (80, 443, 8080) and
    not k8s_containers and
    evt.res=success
  output: >
    Unexpected outbound connection from container
    (user=%user.name container=%container.name image=%container.image.repository
    connection=%fd.name port=%fd.rport command=%proc.cmdline)
  priority: WARNING
  tags: [network, container, mitre_command_and_control]

Detect reading of sensitive files:

- rule: Read Sensitive File
  desc: Detect attempts to read sensitive files like /etc/shadow
  # open_read is a macro from Falco's default rules; it covers open, openat
  # and openat2, so reads made through openat are not missed.
  condition: >
    open_read and container and
    (fd.name startswith /etc/shadow or
     fd.name startswith /etc/passwd or
     fd.name startswith /etc/ssh/ or
     fd.name startswith /root/.ssh/ or
     fd.name startswith /var/run/secrets/)
  output: >
    Sensitive file read in container
    (user=%user.name file=%fd.name container=%container.name
     image=%container.image.repository command=%proc.cmdline)
  priority: CRITICAL
  tags: [filesystem, container, mitre_credential_access]

Detect namespace manipulation (potential container escape):

- rule: Change Namespace via Unshare
  desc: Detect namespace privilege changes via unshare syscall
  condition: >
    evt.type=unshare and container and
    not thread.cap_permitted contains CAP_SYS_ADMIN
  output: >
    Namespace privilege change via unshare
    (user=%user.name uid=%user.uid container=%container.name
     image=%container.image.repository command=%proc.cmdline)
  priority: CRITICAL
  tags: [container, privilege_escalation, mitre_privilege_escalation]

Falco Alert Output Examples

When Falco catches something, the alert it generates packs in a lot of useful context:

{
  "output": "A shell was spawned in a container with an attached terminal (user=root user_uid=0 container_id=abc123 image=my-app process=bash parent=runc command=bash terminal=34816)",
  "priority": "Warning",
  "rule": "Terminal Shell in Container",
  "time": "2026-04-22T14:32:15.123456789Z",
  "output_fields": {
    "container.id": "abc123",
    "container.image.repository": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app",
    "proc.name": "bash",
    "proc.cmdline": "bash",
    "user.name": "root",
    "user.uid": 0
  }
}
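Because the alert is plain JSON, downstream handling is simple. The sketch below filters alerts by minimum priority and builds a one-line summary, which is roughly the kind of routing decision Falcosidekick makes for you. The priority ordering follows Falco's documented levels; the function names and the trimmed sample alert are assumptions for illustration:

```python
import json

# Falco priorities, from most to least severe.
PRIORITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]

def meets_threshold(alert: dict, minimum: str = "warning") -> bool:
    """True when the alert is at least as severe as `minimum`."""
    rank = PRIORITIES.index(alert["priority"].lower())
    return rank <= PRIORITIES.index(minimum.lower())

def summarize(line: str, minimum: str = "warning"):
    """Parse one JSON alert line; return a short summary or None if filtered out."""
    alert = json.loads(line)
    if not meets_threshold(alert, minimum):
        return None
    fields = alert.get("output_fields", {})
    return (f"[{alert['priority']}] {alert['rule']} "
            f"image={fields.get('container.image.repository')} "
            f"cmd={fields.get('proc.cmdline')}")

# Trimmed-down alert in the same shape as the example above.
sample_line = json.dumps({
    "priority": "Warning",
    "rule": "Terminal Shell in Container",
    "output_fields": {"container.image.repository": "my-app",
                      "proc.cmdline": "bash"},
})

print(summarize(sample_line))
```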

AWS GuardDuty Runtime Monitoring

AWS offers a managed alternative to self-hosting Falco through GuardDuty EKS Runtime Monitoring. It deploys the GuardDuty security agent as a DaemonSet and gives you built-in detections without having to manage rule files yourself.

There’s a tradeoff, though. GuardDuty is fully managed and plugs right into the AWS security ecosystem, but you don’t get the same level of control over detection rules and alert formatting that you’d have with Falco. Plenty of teams actually run both – Falco for custom detections tuned to their specific applications, and GuardDuty for the broad, AWS-managed coverage.

Layer 5: Network Policies

Here’s something that catches people off guard: by default, every pod in a Kubernetes cluster can talk to every other pod. That flat network model is great for development, but it’s a real liability in production. Network policies let you lock down pod-to-pod communication.

Default Deny All Traffic

Start with a default deny policy in every namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

That blocks all inbound and outbound traffic for every pod in the namespace. From there, you layer allow policies on top of this deny-everything base.

Allow Specific Application Traffic

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - port: 8080
          protocol: TCP
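If the interaction between a default deny and an allow rule feels abstract, it can be modeled in a few lines. The sketch below evaluates ingress for pods in a single namespace: policies are additive, a pod that no policy selects accepts everything, and a selected pod accepts only what some rule allows. It handles only matchLabels selectors and podSelector peers, so treat it as a teaching aid rather than a faithful reimplementation; the dict shapes and function names are simplified assumptions:

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    """An empty selector matches every pod; otherwise all matchLabels must match."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())

def ingress_allowed(policies, src_labels, dst_labels, port) -> bool:
    selecting = [p for p in policies
                 if "Ingress" in p["policyTypes"]
                 and selector_matches(p["podSelector"], dst_labels)]
    if not selecting:
        return True  # no policy selects the pod, so ingress is unrestricted
    for policy in selecting:
        for rule in policy.get("ingress", []):
            # A rule without "from" (or "ports") places no limit on that dimension.
            from_ok = "from" not in rule or any(
                "podSelector" in peer
                and selector_matches(peer["podSelector"], src_labels)
                for peer in rule["from"])
            port_ok = "ports" not in rule or any(
                entry["port"] == port for entry in rule["ports"])
            if from_ok and port_ok:
                return True
    return False

default_deny = {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}
allow_frontend = {
    "podSelector": {"matchLabels": {"app": "backend-api"}},
    "policyTypes": ["Ingress"],
    "ingress": [{"from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
                 "ports": [{"port": 8080, "protocol": "TCP"}]}],
}
policies = [default_deny, allow_frontend]

print(ingress_allowed(policies, {"app": "frontend"}, {"app": "backend-api"}, 8080))
print(ingress_allowed(policies, {"app": "batch"}, {"app": "backend-api"}, 8080))
```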

Allow DNS Egress

One thing people often forget: every pod needs DNS resolution. Once you’ve put that default deny in place, you have to explicitly allow egress to the CoreDNS service:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP

Network Policy Enforcement on EKS

EKS gives you three common options for network policy enforcement:

| CNI Plugin | Policy Engine | Performance | Best For |
|---|---|---|---|
| Amazon VPC CNI | Built-in (EKS 1.25+) | Line rate | Most EKS deployments |
| Calico | Calico policy engine | High | Advanced policy needs |
| Cilium | eBPF-based | Very high | High-scale environments |

Starting with EKS 1.25, the Amazon VPC CNI plugin (version 1.14 or later of the add-on) supports Kubernetes NetworkPolicy natively once network policy support is switched on in the add-on configuration, which covers most use cases just fine. Turn to Calico or Cilium when you need advanced features like global network policies, HTTP-level rules, or eBPF-based observability.

For a deeper dive into network-level protection, see the AWS WAF rules deep dive which covers the ingress protection layer in front of your cluster.

Layer 6: Host Security

Everything runs on top of the host layer, which makes it the foundation of the whole stack. If a node gets compromised, the attacker essentially gets the keys to every container sitting on that node.

Bottlerocket as the Host OS

Bottlerocket is Amazon’s container-focused host OS, and it ships with a dramatically smaller attack surface compared to general-purpose Linux distros:

| Feature | Bottlerocket | Amazon Linux 2023 | Ubuntu 22.04 |
|---|---|---|---|
| SSH access | Disabled by default | Enabled | Enabled |
| Package manager | None | dnf | apt |
| Shell | No interactive shell | bash | bash |
| Attack surface | Minimal | Standard | Standard |
| Immutable filesystem | Yes | No | No |
| CIS Benchmark profile | Level 1 and 2 | Level 1 | Level 1 |

Our advice: use Bottlerocket for EKS managed node groups whenever you don’t need custom host-level packages. It’s probably the single most impactful host security decision you can make.

CIS Benchmarks with kube-bench

Run CIS Kubernetes Benchmark tests against your EKS nodes:

# Install kube-bench
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job-eks.yaml

# View results
kubectl logs job/kube-bench

The benchmark runs your setup against the CIS EKS Benchmark recommendations, covering everything from control plane configuration to worker node settings and policies.

EKS Managed Node Group Configuration

resource "aws_eks_node_group" "production" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "production"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  scaling_config {
    desired_size = 3
    max_size     = 10
    min_size     = 3
  }

  instance_types = ["m6i.large"]

  # Use Bottlerocket AMI
  ami_type       = "BOTTLEROCKET_x86_64"

  labels = {
    "workload" = "production"
  }

  tags = {
    SecurityLevel = "restricted"
  }
}

Secrets Management

Let’s be blunt about this: Kubernetes stores Secrets as base64-encoded objects in etcd. That’s encoding, not encryption. If you’re running production EKS clusters, you need an actual secrets management strategy.
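If "encoding, not encryption" sounds like a technicality, here's how little it takes to read a Secret's data field once someone can fetch the object. The helper name and the sample value are made up for the demonstration:

```python
import base64

def decode_secret_data(data: dict) -> dict:
    """Decode the base64 values found in a Kubernetes Secret's `data` field."""
    return {key: base64.b64decode(value).decode("utf-8")
            for key, value in data.items()}

# What you would see under .data in `kubectl get secret -o yaml`.
stored = {"password": base64.b64encode(b"s3cr3t").decode("ascii")}

print(stored)                      # looks opaque at a glance
print(decode_secret_data(stored))  # no key or passphrase required
```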

AWS Secrets Manager Integration

Use the AWS Secrets and Configuration Provider (ASCP) for the CSI Secrets Driver to mount secrets from AWS Secrets Manager directly into pods:

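apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-db-credentials
  namespace: production
spec:
  provider: aws
  # Sync the mounted values into a Kubernetes Secret so the pod can use them
  # as env vars. The CSI driver must be installed with secret sync enabled.
  secretObjects:
    - secretName: app-db-credentials
      type: Opaque
      data:
        - objectName: dbusername
          key: dbusername
        - objectName: dbpassword
          key: dbpassword
  parameters:
    objects: |
      - objectName: "prod/db-credentials"
        objectType: "secretsmanager"
        jmesPath:
          - path: username
            objectAlias: dbusername
          - path: password
            objectAlias: dbpassword

Then reference it in your pod spec. Two things to keep in mind: the pod’s service account needs IAM permission to read the secret (through IAM Roles for Service Accounts or EKS Pod Identity), and the synced Kubernetes Secret only exists while at least one pod mounts the CSI volume, which is why the volume mount below is not optional: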

apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: production
spec:
  containers:
    - name: app
      image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: app-db-credentials
              key: dbusername
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-db-credentials
              key: dbpassword
      volumeMounts:
        - name: secrets-store
          mountPath: "/mnt/secrets-store"
          readOnly: true
  volumes:
    - name: secrets-store
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: "app-db-credentials"

For automated secret rotation, see the AWS Secrets Manager rotation with Lambda guide.

Envelope Encryption for etcd

Enable envelope encryption for the Kubernetes secrets stored in etcd. This adds an additional encryption layer using a KMS key:

resource "aws_eks_cluster" "main" {
  name     = "production"
  role_arn = aws_iam_role.cluster.arn

  encryption_config {
    provider {
      key_arn = aws_kms_key.eks.arn
    }
    resources = ["secrets"]
  }

  vpc_config {
    subnet_ids = aws_subnet.private[*].id
  }
}

resource "aws_kms_key" "eks" {
  description             = "EKS Secret Encryption Key"
  deletion_window_in_days = 30
  enable_key_rotation     = true
}

SBOM and Supply Chain Security

Supply chain security used to be something you’d do if you had extra time. Not anymore – it’s now a regulatory requirement in many cases. Here’s how to put it together end-to-end on EKS.

Generate and Sign Artifacts

The full supply chain pipeline looks like this: generate an SBOM, scan it for vulnerabilities, sign the image with Cosign, and then verify that signature at admission time.

# Step 1: Generate SBOM
trivy image --format cyclonedx \
  --output sbom.json \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3

# Step 2: Sign the image with Cosign
cosign sign \
  --key cosign.key \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3

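# Step 3: Attach the SBOM to the image as a signed attestation
# (cosign attach sbom is deprecated in Cosign 2.x in favor of attestations)
cosign attest \
  --key cosign.key \
  --type cyclonedx \
  --predicate sbom.json \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3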

# Step 4: Verify the signature
cosign verify \
  --key cosign.pub \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3

Verify Signatures at Admission with Kyverno

Enforce that only signed images can be deployed to production:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signature
spec:
  validationFailureAction: Enforce
  background: false
  webhookTimeoutSeconds: 30
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"
          failureAction: Enforce
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...
                      -----END PUBLIC KEY-----

Diagram: Container security pipeline architecture

CI/CD Security Pipeline

None of these security layers matter much if they’re not wired into your CI/CD pipeline. Let’s look at a complete GitHub Actions workflow that strings together the full scanning chain:

name: Container Security Pipeline

on:
  push:
    branches: [main]
    paths:
      - 'src/**'
      - 'Dockerfile'
      - 'k8s/**'

env:
  ECR_REGISTRY: 123456789012.dkr.ecr.us-east-1.amazonaws.com
  IMAGE_NAME: my-app

jobs:
  build-and-scan:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
      security-events: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions
          aws-region: us-east-1

      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2

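      - name: Build Image
        run: |
          # Tagging with the commit SHA is an assumption; any unique, immutable tag works
          docker build -t $ECR_REGISTRY/$IMAGE_NAME:${{ github.sha }} .

      - name: Trivy Vulnerability Scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: '${{ env.ECR_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}'
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'HIGH,CRITICAL'
          exit-code: '1'

      - name: Trivy SBOM Generation
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: '${{ env.ECR_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}'
          format: 'cyclonedx-json'
          output: 'sbom.json'
          scan-type: 'image'

      - name: Trivy Secret Scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: '${{ env.ECR_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}'
          format: 'table'
          scan-type: 'image'
          scanners: 'secret'
          # Fail the job on any detected secret so this step acts as a gate
          exit-code: '1'

      - name: Push to ECR
        run: |
          docker push $ECR_REGISTRY/$IMAGE_NAME:${{ github.sha }}
          docker tag $ECR_REGISTRY/$IMAGE_NAME:${{ github.sha }} \
                     $ECR_REGISTRY/$IMAGE_NAME:latest
          docker push $ECR_REGISTRY/$IMAGE_NAME:latest

      - name: Install Cosign
        uses: sigstore/cosign-installer@v3

      - name: Sign Image with Cosign
        # cosign signs the image stored in the registry, so this runs after the push
        run: |
          cosign sign --yes --key env://COSIGN_PRIVATE_KEY \
            $ECR_REGISTRY/$IMAGE_NAME:${{ github.sha }}
        env:
          # Secret names are assumptions; match them to your repository secrets
          COSIGN_PRIVATE_KEY: ${{ secrets.COSIGN_PRIVATE_KEY }}
          COSIGN_PASSWORD: ${{ secrets.COSIGN_PASSWORD }}

      - name: Upload Trivy Results to Security Tab
        # Run even when the scan step fails, otherwise the findings never reach the Security tab
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'

So before an image ever reaches ECR, this pipeline runs three checks against the locally built image: a vulnerability scan (fail on HIGH/CRITICAL), SBOM generation, and secret detection. The push only happens if every one of them passes. Signing comes right after the push, because cosign attaches its signature to an image that already lives in the registry.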
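If you want a quick summary of the scan in the job log as well as in the Security tab, a small script can tally the SARIF results. This is a minimal sketch, not part of the workflow above: it assumes the standard SARIF 2.1.0 layout (`runs[].results[].level`), the function name `count_by_level` is hypothetical, and the sample data stands in for a real `trivy-results.sarif`.

```python
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a result without an explicit level is counted as "warning"
            counts[result.get("level", "warning")] += 1
    return counts


# Hypothetical sample; in a pipeline you would json.load() trivy-results.sarif instead
sample = {
    "version": "2.1.0",
    "runs": [
        {
            "results": [
                {"ruleId": "EXAMPLE-1", "level": "error"},
                {"ruleId": "EXAMPLE-2", "level": "error"},
                {"ruleId": "EXAMPLE-3", "level": "warning"},
            ]
        }
    ],
}

for level, count in sorted(count_by_level(sample).items()):
    print(f"{level}: {count}")
```

The script only reports. The gate itself remains the `exit-code: '1'` setting on the Trivy step.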

Vulnerability Scanner Comparison

Picking the right scanner really comes down to your budget, compliance requirements, and how you prefer to operate. Let’s break down the four leading options:

| Feature | Trivy (Open Source) | Snyk Container | Aqua Platform | Prisma Cloud |
| --- | --- | --- | --- | --- |
| License | Apache 2.0 | Commercial | Commercial | Commercial |
| OS package scanning | Yes | Yes | Yes | Yes |
| Language-specific scanning | 15+ languages | 12+ languages | 15+ languages | 12+ languages |
| IaC scanning | Yes | Yes | Yes | Yes |
| Secret scanning | Yes | Yes | Yes | Yes |
| SBOM generation | SPDX, CycloneDX | CycloneDX | SPDX, CycloneDX | CycloneDX |
| Kubernetes operator | Trivy Operator | Snyk Controller | Aqua Enforcer | Prisma Defender |
| Runtime protection | No (use Falco) | No | Yes | Yes |
| CI/CD integrations | All major platforms | All major platforms | All major platforms | All major platforms |
| Cost | Free | ~$100/developer/mo | Custom pricing | Custom pricing |
| Best for | Budget-conscious teams, full pipeline coverage | Developer-centric workflows | Enterprise runtime + image security | Enterprise with existing Palo Alto stack |

Chart: Vulnerability scanner comparison

For most teams running EKS, we’d go with Trivy for build-time and CI/CD scanning paired with Falco for runtime protection. You get full coverage without any commercial licensing costs. Consider Snyk if your developers are after tight IDE integration, Aqua if you need commercial-grade runtime protection, or Prisma Cloud if you’re already deep in the Palo Alto ecosystem.

Compliance Frameworks

Container security doesn’t exist in a vacuum, of course. Most organizations have to show compliance with one or more regulatory frameworks, so let’s connect the dots.

Mapping Security Controls to Compliance

| Security Control | SOC 2 | PCI-DSS | HIPAA |
| --- | --- | --- | --- |
| Image vulnerability scanning | CC6.1 | 6.2 | 164.308(a)(5) |
| Registry scanning and enforcement | CC6.1 | 6.2 | 164.308(a)(5) |
| Admission control policies | CC6.3 | 6.3 | 164.312(a)(1) |
| Runtime threat detection | CC7.2 | 10.4 | 164.312(b) |
| Network segmentation | CC6.6 | 1.3 | 164.312(e)(1) |
| Secrets encryption at rest | CC6.1 | 3.4 | 164.312(a)(2)(iv) |
| SBOM and supply chain integrity | CC6.1 | 6.2 | 164.308(a)(5) |
| Audit logging and monitoring | CC7.2 | 10.2 | 164.312(b) |
| Host hardening (CIS Benchmark) | CC6.1 | 2.2 | 164.310(b) |

CIS Benchmark Checklist for EKS

Work through this checklist for every EKS cluster you run in production:

| Item | Benchmark Section | Action | Verification |
| --- | --- | --- | --- |
| Control plane logging | 3.1 | Enable all five log types | aws eks describe-cluster --name <cluster-name> --query 'cluster.logging' |
| API server audit logs | 3.2 | Configure audit log policy | Verify log entries in CloudWatch |
| etcd encryption | 4.1 | Enable KMS envelope encryption | Check encryptionConfig in cluster spec |
| Node IAM least privilege | 4.2 | Use IRSA, minimize node role | Review IAM policy attached to node role |
| Pod Security Standards | 5.1 | Enforce baseline or restricted | Check namespace labels |
| Network policies | 5.2 | Default deny in all namespaces | kubectl get networkpolicy --all-namespaces |
| Secrets encryption | 5.3 | Use AWS Secrets Manager + CSI driver | Check SecretProviderClass resources |
| RBAC least privilege | 5.4 | Audit cluster role bindings | kubectl get clusterrolebindings -o yaml |
| Image provenance | 5.5 | Verify image signatures at admission | Check Kyverno/Gatekeeper policies |
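Eyeballing `kubectl get networkpolicy --all-namespaces` gets tedious once you have more than a handful of namespaces, so the network policy row is a good candidate for automation. Here is a rough sketch that flags namespaces lacking a default-deny ingress policy. It assumes you've already parsed the output of `kubectl get networkpolicy --all-namespaces -o json` and collected namespace names yourself; the function names and the sample objects are hypothetical.

```python
from typing import Dict, List


def is_default_deny_ingress(policy: Dict) -> bool:
    """True if the policy selects every pod and allows no ingress traffic."""
    spec = policy.get("spec", {})
    selector = spec.get("podSelector") or {}
    selects_all_pods = not selector.get("matchLabels") and not selector.get("matchExpressions")
    # When policyTypes is omitted, Kubernetes treats the policy as an Ingress policy
    policy_types = spec.get("policyTypes", ["Ingress"])
    return selects_all_pods and "Ingress" in policy_types and not spec.get("ingress")


def namespaces_without_default_deny(namespaces: List[str], policies: List[Dict]) -> List[str]:
    """Return the namespaces that have no default-deny ingress policy."""
    covered = {
        p["metadata"]["namespace"] for p in policies if is_default_deny_ingress(p)
    }
    return sorted(ns for ns in namespaces if ns not in covered)


# Hypothetical data shaped like the "items" list in the kubectl JSON output
policies = [
    {
        "metadata": {"name": "default-deny", "namespace": "payments"},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    },
    {
        "metadata": {"name": "allow-web", "namespace": "frontend"},
        "spec": {
            "podSelector": {"matchLabels": {"app": "web"}},
            "ingress": [{"ports": [{"port": 443}]}],
        },
    },
]

print(namespaces_without_default_deny(["payments", "frontend"], policies))
```

This only checks ingress. A similar test against `Egress` and an empty `egress` list would cover the other direction.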

Cost Analysis

Security tooling isn’t free, so let’s look at what you’re actually going to spend. Here’s a realistic breakdown for a mid-size EKS deployment (5 clusters, 50 nodes, 200 workloads):

| Component | Monthly Cost | Notes |
| --- | --- | --- |
| Trivy (self-hosted) | $0 (open source) | Compute for Trivy server: ~$50 if using dedicated instance |
| ECR Enhanced Scanning | $200-400 | Based on image count and re-scan frequency |
| Falco (self-hosted) | $0 (open source) | DaemonSet runs on each node, minimal resource usage |
| Falcosidekick + CloudWatch | $50-100 | Log ingestion and alert routing |
| Kyverno | $0 (open source) | Runs as in-cluster deployment |
| AWS KMS (etcd encryption) | $1-5 per key | Key usage charges |
| AWS Secrets Manager | $0.40 per secret/month | Plus API call charges |
| GuardDuty EKS Runtime | $4.50 per million events | Varies by cluster activity |
| S3 (SBOM storage) | $5-20 | Minimal storage costs |
| Total Estimated | $300-600/month | For the full stack across 5 clusters |

Stack that up against commercial alternatives and the math gets interesting fast. Snyk Container alone runs about $100 per developer per month. Aqua Platform and Prisma Cloud typically start north of $20,000 a year. The open source stack (Trivy + Falco + Kyverno) delivers something like 80-90% of the coverage at a fraction of the cost. The catch, of course, is operational ownership – your team has to manage the tooling instead of paying a vendor to handle it.
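To see how the per-unit line items roll up into a monthly figure, here's a back-of-the-envelope sketch. The fixed ranges and unit prices are copied from the cost table; the usage numbers (5 KMS keys, 50 secrets, 2 million GuardDuty events) are purely illustrative assumptions, so plug in your own.

```python
from typing import Dict, Tuple

# (low, high) monthly USD ranges copied from the cost table
FIXED_COSTS: Dict[str, Tuple[float, float]] = {
    "Trivy server": (0.0, 50.0),
    "ECR Enhanced Scanning": (200.0, 400.0),
    "Falcosidekick + CloudWatch": (50.0, 100.0),
    "S3 SBOM storage": (5.0, 20.0),
}


def estimate_monthly_cost(
    kms_keys: int, secrets: int, guardduty_million_events: float
) -> Tuple[float, float]:
    """Return a (low, high) monthly estimate in USD for the open source stack."""
    low = sum(lo for lo, _ in FIXED_COSTS.values())
    high = sum(hi for _, hi in FIXED_COSTS.values())
    # KMS is listed at $1-5 per key
    low += kms_keys * 1.0
    high += kms_keys * 5.0
    # Secrets Manager ($0.40 per secret) and GuardDuty ($4.50 per million events)
    usage = secrets * 0.40 + guardduty_million_events * 4.50
    return low + usage, high + usage


# Hypothetical usage figures for a 5-cluster deployment
low, high = estimate_monthly_cost(kms_keys=5, secrets=50, guardduty_million_events=2)
print(f"Estimated monthly cost: ${low:.0f}-${high:.0f}")
```

With these assumed inputs the estimate lands close to the table's range. Secrets Manager API calls and the engineering time needed to operate the tooling are not included.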

Best Practices Summary

Having implemented container security across dozens of EKS clusters, here’s what we’ve found actually moves the needle:

  1. Start with image scanning in CI/CD. This one’s the highest-impact, lowest-effort security control you can add. Wire Trivy into your pipeline and fail builds on HIGH/CRITICAL vulnerabilities. Do this before anything else.

  2. Use Bottlerocket for node groups. The reduced attack surface alone makes the switch worthwhile. No SSH, no package manager, no interactive shell. If you need to debug a node, reach for SSM Session Manager instead.

  3. Enable ECR continuous scanning. A clean image today can turn into a vulnerable one tomorrow. Continuous re-scanning catches new CVEs in images you deployed weeks or even months ago.

  4. Default deny network policies. Put a default deny-all policy in every namespace, then layer specific allow rules on top. This is what stops lateral movement if a pod gets compromised.

  5. Enforce Pod Security Standards. At a minimum, enforce the baseline profile. Then work your way toward restricted across all production namespaces.

  6. Deploy Falco for runtime detection. Image scanning can’t help you against zero-day exploits in running workloads. Falco catches anomalous behavior right at the syscall level.

  7. Sign images and verify at admission. Use Cosign to sign images in your pipeline, and have Kyverno verify those signatures before any pods get created. That way, tampered images never get a chance to run.

  8. Use AWS Secrets Manager, not Kubernetes Secrets. Kubernetes Secrets are base64-encoded, not encrypted (well, unless you enable envelope encryption). Use the CSI driver to mount secrets from AWS Secrets Manager instead.

  9. Generate SBOMs for every image. Regulatory requirements are making this mandatory in a lot of places now. Even without a mandate, SBOMs are absolute lifesavers when a new CVE drops and you need to quickly figure out which images are affected.

  10. Automate everything. Manual security checks simply don’t scale. Every control we’ve talked about in this post should be baked into your CI/CD pipeline and cluster configuration.
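The SBOM point in item 9 is easy to picture with a little code. The sketch below assumes you've loaded one CycloneDX JSON document per image (as produced by the `cyclonedx-json` format in the pipeline) into a dictionary keyed by image reference, and that each component carries `name` and `version` fields. The function name, image names, and package versions are hypothetical.

```python
from typing import Dict, List, Set


def find_affected_images(
    sboms: Dict[str, dict], package: str, bad_versions: Set[str]
) -> List[str]:
    """Return image references whose SBOM lists the package at a vulnerable version."""
    affected = []
    for image, sbom in sboms.items():
        for component in sbom.get("components", []):
            if component.get("name") == package and component.get("version") in bad_versions:
                affected.append(image)
                break  # one hit is enough to flag the image
    return sorted(affected)


# Hypothetical SBOMs keyed by image reference
sboms = {
    "my-app:abc123": {
        "bomFormat": "CycloneDX",
        "components": [
            {"name": "openssl", "version": "3.0.1"},
            {"name": "zlib", "version": "1.2.13"},
        ],
    },
    "billing:def456": {
        "bomFormat": "CycloneDX",
        "components": [{"name": "openssl", "version": "3.0.9"}],
    },
}

print(find_affected_images(sboms, "openssl", {"3.0.1", "3.0.2"}))
```

Exact version matching keeps the example short. A real lookup would more likely compare version ranges or match on the component's package URL.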

Putting It All Together

Container security on EKS isn’t really about picking the right tool – it’s about layering the right controls across every stage of the container lifecycle. Scan images at build time with Trivy. Enforce policies at the registry with ECR Enhanced Scanning. Block bad configurations at admission with Kyverno. Detect runtime threats with Falco. Segment the network with NetworkPolicies. Protect secrets with AWS Secrets Manager. Sign images with Cosign and verify them at the gate.

Diagram: Container security pipeline architecture

No single layer catches everything, and that’s the whole point. The shell that slips past your admission controller? Falco catches it at runtime. The CVE that nobody knew about yesterday? ECR continuous scanning flags it tomorrow. The lateral movement attempt? Your default-deny network policy blocks it cold. That’s how layered defense actually works.

The full stack we covered here (Trivy, ECR Enhanced Scanning, Kyverno, Falco, NetworkPolicies, Secrets Manager, Cosign) runs about $300-600 per month for a mid-size deployment and delivers coverage comparable to commercial platforms that cost five to ten times as much. The tradeoff is that you own the operations – but that investment pays for itself pretty quickly.

For more on the broader security architecture around your EKS workloads, see the Docker sandboxes and microVM security guide for isolation strategies beyond traditional containers.

Bits Lovers


Professional writer and blogger. Focus on Cloud Computing.
