Flux CD + OpenTofu: GitOps for Kubernetes and Infrastructure
HashiCorp switched Terraform to the Business Source License in August 2023. Within weeks, the OpenTofu fork was announced and adopted by the Linux Foundation, reaching a stable 1.6 release by early 2024. By mid-2026, OpenTofu 1.9 is a production-ready, drop-in replacement for Terraform 1.x — same HCL syntax, same provider ecosystem, compatible state format.
Flux CD v2 is the CNCF-graduated GitOps operator for Kubernetes, handling both application deployments and, via the tofu-controller, OpenTofu runs triggered from Git commits. Together they cover the full infrastructure lifecycle from a single Git workflow: push a change to an OpenTofu module, Flux detects it, runs tofu apply, and reconciles the resulting state back into the cluster. This guide covers Flux installation, the four Flux controllers, deploying applications with Kustomize and Helm, and wiring up the tofu-controller for infrastructure-as-code GitOps on EKS.
How Flux Works
Flux runs four core controllers as pods in your cluster (bootstrap can add extras such as the image-reflector and image-automation controllers):
- Source Controller: watches Git repos, Helm repos, and OCI registries for changes. Fetches and caches artifacts.
- Kustomize Controller: applies Kustomize overlays and raw Kubernetes manifests from Sources.
- Helm Controller: installs and upgrades Helm charts declared as HelmRelease objects.
- Notification Controller: sends alerts to Slack, PagerDuty, or GitHub commit statuses on reconciliation events.
The reconciliation loop runs every few minutes by default. When you push a commit, the Source Controller detects it within one poll interval (default: 1 minute for Git). For immediate reconciliation, flux reconcile forces it manually.
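Polling can also be supplemented with push-based triggers: the Notification Controller exposes webhook Receivers that reconcile a source the moment the Git host fires an event. A sketch, assuming a GitRepository named my-api and a webhook-token Secret (both illustrative names):

```yaml
apiVersion: notification.toolkit.fluxcd.io/v1
kind: Receiver
metadata:
  name: my-api-push
  namespace: flux-system
spec:
  type: github           # verifies GitHub's HMAC signature on incoming payloads
  events:
    - "push"
  secretRef:
    name: webhook-token  # Secret holding the shared webhook secret (key: token)
  resources:
    - apiVersion: source.toolkit.fluxcd.io/v1
      kind: GitRepository
      name: my-api
```

The generated webhook path appears in the Receiver's status; point a GitHub webhook at the notification-controller's exposed endpoint and the poll interval becomes a fallback rather than the primary trigger.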
Installing Flux on EKS
The flux bootstrap command does two things at once: it installs the controllers into your cluster and commits the component manifests into your Git repository, so Flux immediately starts managing itself. From that point forward, changing Flux configuration means opening a PR — not running kubectl:
# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash
# Verify prerequisites
flux check --pre
# Bootstrap Flux with GitHub (creates flux-system namespace and commits manifests)
export GITHUB_TOKEN=ghp_xxxxxxxxxxxx
export GITHUB_USER=my-org
flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=fleet-infra \
  --branch=main \
  --path=./clusters/production \
  --personal=false \
  --components-extra=image-reflector-controller,image-automation-controller
# Verify all controllers are running
kubectl get pods -n flux-system
# NAME                                          READY   STATUS    RESTARTS
# helm-controller-5f7b8c9d6-xxxxx               1/1     Running   0
# image-automation-controller-7b9d6c8f5-xxxxx   1/1     Running   0
# image-reflector-controller-6c8e5d9b7-xxxxx    1/1     Running   0
# kustomize-controller-7c6b4d8f9-xxxxx          1/1     Running   0
# notification-controller-8d9c5e7f6-xxxxx       1/1     Running   0
# source-controller-6e8f7d4c5-xxxxx             1/1     Running   0
After bootstrap, the fleet-infra repository at clusters/production/flux-system/ contains the Flux component manifests. Flux is now reconciling itself from Git — any change to those manifests is automatically applied.
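Managing Flux through Git means even controller tuning happens via PR. As an illustrative sketch, a patch appended to the bootstrap-generated kustomization.yaml could raise the kustomize-controller's concurrency (the flag value here is an assumption, not a recommendation):

```yaml
# clusters/production/flux-system/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |
      # add a CLI flag to the kustomize-controller Deployment
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --concurrent=10
    target:
      kind: Deployment
      name: kustomize-controller
```

Merge the PR and Flux applies the change to its own Deployment on the next reconciliation.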
Deploying Applications with Kustomize
Define a GitRepository source and a Kustomization that points to your app manifests:
# clusters/production/apps/source.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-api
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/my-org/my-api
  ref:
    branch: main
  secretRef:
    name: github-token  # Secret with type: Opaque, data.username + data.password
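The secretRef above expects a github-token Secret to already exist in flux-system. A minimal sketch of its shape — in practice create it with flux create secret git, or store it SOPS-encrypted rather than committing plaintext:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: github-token
  namespace: flux-system
type: Opaque
stringData:
  username: git                        # any non-empty value works for token auth
  password: <fine-grained GitHub PAT>  # needs read access to the repository
```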
# clusters/production/apps/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-api
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: my-api
  path: "./k8s/overlays/production"
  prune: true  # Delete resources removed from Git
  wait: true   # Wait for resources to become Ready
  timeout: 5m
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: my-api
      namespace: my-api
The prune: true field is what makes Flux genuinely GitOps rather than just apply-on-commit: resources deleted from Git are deleted from the cluster. Without it, you’d accumulate orphaned resources.
# Watch reconciliation status
flux get kustomizations --watch
# NAME     REVISION            SUSPENDED   READY   MESSAGE
# my-api   main@sha1:abc1234   False       True    Applied revision: main@sha1:abc1234
# Force immediate reconciliation
flux reconcile kustomization my-api --with-source
# Check health
flux get all -n flux-system
Deploying Helm Charts with HelmRelease
The Helm Controller manages chart installations declared as HelmRelease objects:
# clusters/production/monitoring/prometheus.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: prometheus-community
  namespace: flux-system
spec:
  interval: 12h
  url: https://prometheus-community.github.io/helm-charts
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
  namespace: monitoring
spec:
  interval: 1h
  chart:
    spec:
      chart: kube-prometheus-stack
      version: ">=58.0.0 <59.0.0"
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
        namespace: flux-system
  values:
    grafana:
      adminPassword: "${GRAFANA_ADMIN_PASSWORD}"  # injected via Flux post-build substitution from a Secret
    prometheus:
      prometheusSpec:
        retention: 15d
        storageSpec:
          volumeClaimTemplate:
            spec:
              storageClassName: gp3
              resources:
                requests:
                  storage: 50Gi
  install:
    remediation:
      retries: 3
  upgrade:
    cleanupOnFail: true
    remediation:
      retries: 3
      strategy: rollback
The version range ">=58.0.0 <59.0.0" tells Flux to automatically upgrade within the minor range but never cross a major version without a deliberate manifest change. The upgrade.remediation.strategy: rollback means a failed upgrade automatically rolls back to the last successful release.
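The ${GRAFANA_ADMIN_PASSWORD} placeholder above is not a Helm mechanism — it relies on Flux's post-build variable substitution, configured on the Flux Kustomization that applies this manifest. A sketch, with cluster-secrets as a hypothetical Secret whose GRAFANA_ADMIN_PASSWORD key supplies the value:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: monitoring
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: fleet-infra
  path: ./clusters/production/monitoring
  prune: true
  postBuild:
    substituteFrom:
      - kind: Secret
        name: cluster-secrets  # hypothetical; each data key becomes a ${VAR}
```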
OpenTofu on EKS via tofu-controller
The tofu-controller is a Flux-compatible controller that runs OpenTofu inside your cluster, reading .tf files from a Git source and reconciling infrastructure state. Install it alongside Flux:
# Install tofu-controller
flux create source helm tofu-controller \
  --url=https://flux-iac.github.io/tofu-controller/ \
  --namespace=flux-system
flux create helmrelease tofu-controller \
  --chart=tofu-controller \
  --source=HelmRepository/tofu-controller \
  --namespace=flux-system \
  --chart-version=">=0.16.0"
# Verify
kubectl get pods -n flux-system | grep tofu
# tofu-controller-5f9b8c7d-xxxxx 1/1 Running 0
A note on naming: the CRD is still called Terraform in the tofu-controller, not OpenTofu. The controller runs OpenTofu binaries underneath, but the Kubernetes API object kept its original name for backwards compatibility. Don’t let that confuse you when reading status output:
# clusters/production/infrastructure/vpc.yaml
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: vpc
  namespace: flux-system
spec:
  interval: 15m
  approvePlan: "auto"  # auto-approve plans; set to "" for manual approval
  path: ./terraform/vpc
  sourceRef:
    kind: GitRepository
    name: fleet-infra
    namespace: flux-system
  vars:
    - name: cluster_name
      value: production-eks
    - name: vpc_cidr
      value: "10.0.0.0/16"
  backendConfig:
    customConfiguration: |
      backend "s3" {
        bucket         = "my-tfstate-bucket"
        key            = "production/vpc/terraform.tfstate"
        region         = "us-east-1"
        dynamodb_table = "terraform-state-lock"
        encrypt        = true
      }
  serviceAccountName: tofu-runner  # SA with IRSA permissions to manage VPC resources
When you push a commit that modifies terraform/vpc/, Flux detects the change via the Source Controller, the tofu-controller generates a plan, and with approvePlan: "auto" it applies immediately. For production infrastructure you’ll want approvePlan: "" instead — this pauses after the plan step and requires a human to set the approval field:
# Review the generated plan
kubectl get terraform vpc -n flux-system -o jsonpath='{.status.plan.message}'
# Approve the plan (after review)
kubectl patch terraform vpc -n flux-system \
  --type=merge -p '{"spec":{"approvePlan":"plan-abc123"}}'
IRSA for the tofu-controller
When the tofu-controller runs tofu apply to provision an EKS cluster or VPC, the runner pod needs AWS credentials. The wrong approach is mounting an access key pair as a Secret — that key needs rotation, it can leak, and it’s not auditable at the API call level. IRSA gives the runner pod short-lived credentials tied to a specific IAM role without any secrets in the cluster:
# Create the IRSA role
aws iam create-role \
  --role-name tofu-controller-runner \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"},
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:flux-system:tofu-runner"
        }
      }
    }]
  }'
# Attach appropriate policies (scope to what your TF modules actually need)
aws iam attach-role-policy \
  --role-name tofu-controller-runner \
  --policy-arn arn:aws:iam::aws:policy/AmazonVPCFullAccess
# The ServiceAccount for the runner pod
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tofu-runner
  namespace: flux-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/tofu-controller-runner
Multi-Environment Setup
Most teams run at least two clusters (staging and production) with shared application code but different configuration values. The convention the Flux community settled on puts cluster-specific paths under clusters/ and reusable app manifests under apps/. This keeps staging and production separated at the cluster layer without duplicating application manifests:
fleet-infra/
├── clusters/
│   ├── production/
│   │   ├── flux-system/       # Flux components (bootstrapped)
│   │   ├── infrastructure/    # Kyverno, cert-manager, ingress, tofu resources
│   │   └── apps/              # Application kustomizations
│   └── staging/
│       ├── flux-system/
│       ├── infrastructure/
│       └── apps/
└── apps/
    ├── base/                  # Shared app manifests
    └── overlays/
        ├── production/        # Prod-specific patches
        └── staging/           # Staging-specific patches
Each cluster’s flux-system/ contains a kustomization.yaml that references the infrastructure and apps directories in order:
# clusters/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - flux-system
  - infrastructure
  - apps
A plain Kustomize resource list doesn't by itself guarantee that infrastructure (cert-manager, networking) is Ready before apps start deploying — the entries are applied together. Use dependsOn on the Flux Kustomization for explicit ordering:
# clusters/production/apps/kustomization.yaml (Flux CRD)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  dependsOn:
    - name: infrastructure  # Apps wait for infrastructure Kustomization to be Ready
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: fleet-infra
  path: ./apps/overlays/production
  prune: true
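The path above lands on a plain Kustomize overlay. As an illustrative sketch (the replica count and the my-api Deployment are assumptions), apps/overlays/production/kustomization.yaml might look like:

```yaml
# apps/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - patch: |
      # production runs more replicas than the shared base
      - op: replace
        path: /spec/replicas
        value: 5
    target:
      kind: Deployment
      name: my-api
```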
Notifications
The Notification Controller sends Slack or GitHub status updates when reconciliation completes or fails:
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack-ops
  namespace: flux-system
spec:
  type: slack
  channel: "#deployments"
  secretRef:
    name: slack-webhook-url
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: on-call-alert
  namespace: flux-system
spec:
  providerRef:
    name: slack-ops
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: "*"
    - kind: HelmRelease
      name: "*"
    - kind: Terraform
      name: "*"
  summary: "Flux reconciliation failure in production"
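Beyond Slack, a github provider posts reconciliation results as commit statuses, making failed deploys visible directly on the commit or PR. A sketch, assuming the fleet-infra repo and a github-token Secret whose token has repo:status scope:

```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: github-status
  namespace: flux-system
spec:
  type: github
  address: https://github.com/my-org/fleet-infra
  secretRef:
    name: github-token
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: commit-status
  namespace: flux-system
spec:
  providerRef:
    name: github-status
  eventSources:
    - kind: Kustomization
      name: my-api
```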
For the ArgoCD alternative to Flux — both are CNCF GitOps operators with different UX tradeoffs — the ArgoCD on EKS guide covers ArgoCD’s app-of-apps pattern and its UI. For the Helm charts Flux manages here, the Helm Charts on EKS guide covers chart structure and OCI registry publishing. For the OpenTofu state bucket and DynamoDB lock table setup, the GitHub Actions with Terraform guide shows the S3 backend configuration pattern that the tofu-controller’s backendConfig references.