Terraform Debug

Written by Bits Lovers on

When Terraform does not do what you expect, you need to figure out why. This post covers the debugging tools Terraform gives you and how I use them in practice.

How Terraform Breaks Things Down

Terraform splits its work into four parts:

  • Language - Your HCL configuration syntax
  • Core - Validation and state management
  • State - The mapping between your config and real resources
  • Provider - The API calls to your cloud

Each part fails in a recognizable way, and core and provider logs can be enabled separately, which helps you narrow down where the problem lives.

Language Errors

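A language error is a syntax mistake in your .tf files. Run terraform fmt to check your HCL before planning:

terraform fmt

This rewrites your config to canonical format and stops with a parse error on malformed HCL, such as an unclosed ${ interpolation or a missing quote. It does not check references or provider schemas; terraform validate covers those.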
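Here is a minimal sketch of a pre-plan check that combines both commands. The function name check_config is made up for illustration. The -check, -recursive, and -diff flags are standard terraform fmt options, and terraform validate assumes the directory has already been initialized with terraform init.

```shell
# check_config: hypothetical helper name, not part of Terraform.
# Fails fast if files are not canonically formatted or the config is invalid.
check_config() {
  # -check exits non-zero instead of rewriting files; -diff prints what would change
  terraform fmt -check -recursive -diff || return 1
  # validate catches reference and schema errors that fmt does not look at
  terraform validate
}
```

Unlike a bare terraform fmt, this version leaves your files untouched, which is usually what you want in CI.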

State Errors

The state file drifts from reality. Maybe someone edited resources manually, or the state got corrupted. Drift causes the “resource not found” and unexpected “destructive changes” surprises during apply.
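One way to check for drift without touching configuration is a refresh-only plan. The sketch below assumes that -detailed-exitcode keeps its documented meaning in refresh-only mode (0 for no changes, 1 for an error, 2 for changes present). The function name check_drift is hypothetical.

```shell
# check_drift: hypothetical helper; compares state against real infrastructure
# without proposing any configuration changes.
check_drift() {
  rc=0
  # -refresh-only limits the plan to differences between state and reality
  terraform plan -refresh-only -detailed-exitcode -input=false >/dev/null || rc=$?
  case "$rc" in
    0) echo "no drift detected" ;;
    2) echo "drift detected: review it with 'terraform plan -refresh-only'" ;;
    *) echo "plan failed" >&2; return 1 ;;
  esac
}
```

Run it from an initialized working directory; it only reports and never writes to state.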

Core Errors

Core handles everything else: syntax validation, dependency resolution, change calculation, and provider communication. When something does not fit the schema or a dependency cycle exists, core catches it.

Provider Errors

Providers translate Terraform’s resource graph into API calls. If your AWS credentials are bad, or a Security Group does not exist, the provider reports it here.

Log Levels

Terraform uses these log levels (most to least verbose):

Level   What you get
TRACE   Every function call and return
DEBUG   Internal operations
INFO    High-level operations
WARN    Potential problems
ERROR   Failures that block execution

Enabling Logs

Set TF_LOG to enable logging for both Terraform core and providers:

export TF_LOG=TRACE
terraform plan

Logs go to stderr. For large runs, save them to a file:

export TF_LOG=TRACE
export TF_LOG_PATH=./terraform.log
terraform plan

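TF_LOG_PATH does nothing on its own; logging must also be enabled through TF_LOG, TF_LOG_CORE, or TF_LOG_PROVIDER. Terraform appends to the file rather than overwriting it.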
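Because the log file is appended to, I find it easier to give each run its own file. This is a small sketch of that idea; the wrapper name tf_trace and the file naming scheme are my own assumptions, not anything Terraform provides.

```shell
# tf_trace: hypothetical wrapper that writes each run's TRACE log to its own file.
tf_trace() {
  # One log file per invocation, named after the subcommand and the current time
  log_file="./tf-${1:-run}-$(date +%Y%m%dT%H%M%S).log"
  rc=0
  # Setting the variables inline scopes them to this one command
  TF_LOG=TRACE TF_LOG_PATH="$log_file" terraform "$@" || rc=$?
  echo "log file: $log_file" >&2
  return "$rc"
}
```

Usage is the same as the normal CLI, for example tf_trace plan -input=false. Your interactive shell stays free of exported TF_LOG variables, so later commands are not accidentally verbose.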

Structured JSON Logging

Setting TF_LOG=JSON switches the log output to one JSON object per line, at TRACE verbosity. This mode predates Terraform 1.10. The JSON format is machine-readable and suitable for log aggregation pipelines:

export TF_LOG=JSON
export TF_LOG_PATH=./terraform-debug.log
terraform plan
jq . ./terraform-debug.log

For production debugging in CI/CD, TF_LOG=JSON produces parseable output that can be shipped to systems like Datadog, Splunk, or CloudWatch Logs. Terraform's documented environment variables do not include a TF_LOG_MASK setting, and TRACE output can contain secrets, so keep the log file private and scrub it before shipping:

# Coarse filter: drop every log line that mentions a likely secret
export TF_LOG=JSON
export TF_LOG_PATH=./terraform-debug.log
terraform apply
grep -viE 'password|secret|token' ./terraform-debug.log > ./terraform-debug.redacted.log

Debugging Provider Logs

TF_LOG_PROVIDER controls provider logging, which for the AWS provider includes its AWS SDK calls. It takes the same levels as TF_LOG:

export TF_LOG_PROVIDER=TRACE
terraform plan

This shows the actual API calls Terraform makes to AWS — every DescribeInstances, CreateVolume, and AuthorizeSecurityGroupIngress. Provider logs are where you find “resource not found” errors, IAM permission denials, and API rate limit errors.

Debugging Core Logs

For core-only debugging (validation, state operations):

export TF_LOG_CORE=TRACE
terraform plan

Set TF_LOG_CORE and TF_LOG_PROVIDER separately to control core and provider verbosity independently. Both take a log level, not a path; TF_LOG_PATH names the single file that receives the output:

export TF_LOG_CORE=TRACE
export TF_LOG_PROVIDER=DEBUG
export TF_LOG_PATH=/tmp/terraform-debug.log
terraform apply

Machine-Readable Logs

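Set TF_LOG=JSON for JSON-encoded output at TRACE level. Useful for parsing with tools:

export TF_LOG=JSON
# Send only stderr (the logs) to jq; non-JSON lines pass through unchanged
terraform plan 2>&1 >/dev/null | jq -R 'fromjson? // .'

HashiCorp's documentation does not treat the JSON log encoding as a stable format, so pin your Terraform version if your tooling depends on specific fields.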
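As a sketch of what parsing with tools can look like, the helper below prints only error-level entries from a JSON log file. The function name tf_log_errors is hypothetical, and the "@level", "@timestamp", and "@message" keys are an assumption based on the go-hclog JSON encoding, so check them against your own log before relying on this.

```shell
# tf_log_errors: hypothetical helper that prints error-level entries from a
# log file written with TF_LOG=JSON.
# Assumes each entry carries "@level", "@timestamp", and "@message" keys.
tf_log_errors() {
  # -R reads raw lines so a non-JSON line is skipped instead of aborting jq;
  # "objects" drops any JSON value that is not an object
  jq -R -r 'fromjson? | objects | select(."@level" == "error") | "\(."@timestamp") \(."@message")"' "$1"
}
```

For example, tf_log_errors ./terraform-debug.log prints one line per error with its timestamp.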

State Inspection Commands

Before enabling verbose logging, check the state directly — most issues are visible in state:

# List all resources in state
terraform state list

# Filter resources by type (the -id flag filters by resource ID, not by address)
terraform state list | grep aws_instance

# Show full resource attributes
terraform state show aws_instance.app

# Pull raw state JSON (for automation)
terraform state pull

# Rename a resource in state (use when refactoring)
terraform state mv aws_instance.old aws_instance.new

# Remove a deleted resource from state
terraform state rm aws_instance.deleted

# Import existing infrastructure
terraform import aws_instance.existing i-1234567890abcdef0

Warning: Never edit .tfstate files directly. The JSON format is internal and manual edits can corrupt state. Always use terraform state subcommands.
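Before running terraform state mv or terraform state rm, it is worth taking a snapshot you can fall back on. This is a minimal sketch; the function name backup_state and the backup file naming are assumptions of mine.

```shell
# backup_state: hypothetical helper that snapshots state before a risky
# `terraform state mv` or `terraform state rm`.
backup_state() {
  backup="./state-backup-$(date +%Y%m%dT%H%M%S).tfstate"
  # state pull prints the current state (local or remote) to stdout
  terraform state pull > "$backup" || return 1
  # Print the path so callers can record where the snapshot went
  echo "$backup"
}
```

The snapshot contains the same secrets as the real state, so keep it out of version control and delete it once the refactor is verified.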

terraform console: Testing Expressions

The console command evaluates Terraform expressions interactively — useful for testing complex variable combinations:

terraform console

# Then test expressions:
> local.environment
"production"
> var.tags
tomap({"Environment" = "prod", "Team" = "platform"})
> lookup(var.region_map, "us-east-1", "us-west-2")
"us-west-2"
> formatdate("YYYY-MM-DD", timestamp())
"2026-04-05"

Terraform 1.8+ extended the console with support for provider-defined functions, making it more useful for testing complex module inputs.
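The console also reads expressions from standard input, which makes it usable in scripts. A small sketch, with tf_eval as a hypothetical helper name:

```shell
# tf_eval: hypothetical helper that evaluates one expression with terraform console.
tf_eval() {
  # console reads expressions from stdin when input is piped to it
  printf '%s\n' "$1" | terraform console
}
```

For example, tf_eval 'cidrsubnet("10.0.0.0/16", 8, 2)' should print "10.0.2.0/24". As with the interactive console, the working directory needs terraform init first.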

terraform graph: Visualizing the Dependency Graph

The graph command outputs DOT format. Pipe it to Graphviz to visualize the dependency graph:

terraform graph | dot -Tpng > graph.png

Cycles are a major source of plan/apply failures. The -draw-cycles flag highlights cycles with colored edges. It only applies to a real operation graph, so select one with -type:

terraform graph -type=plan -draw-cycles | dot -Tpng > graph-with-cycles.png

terraform test: Module Testing (GA since 1.6)

Terraform 1.6+ brought first-class module testing with terraform test. This is the right way to validate module behavior before deploying:

# tests/unit.tftest.hcl
run "test_vpc" {
  command = plan
  variables {
    cidr_block = "10.0.0.0/16"
    environment = "test"
  }
  assert {
    condition     = aws_vpc.test.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR mismatch"
  }
}

run "test_subnets" {
  command = plan
  variables {
    cidr_block = "10.0.0.0/16"
  }
  assert {
    condition     = length(aws_subnet.public) == 2
    error_message = "Expected 2 public subnets"
  }
}

Run tests:

# Run all tests
terraform test

# Verbose output
terraform test -verbose

# Run a specific test file
terraform test -filter=tests/unit.tftest.hcl

Terraform 1.7+ added mock_provider blocks, with nested mock_resource and mock_data blocks, enabling unit testing of modules without real infrastructure:

# tests/mocked.tftest.hcl
mock_provider "aws" {
  # Values returned for every aws_caller_identity data source in the module
  mock_data "aws_caller_identity" {
    defaults = {
      account_id = "123456789012"
      arn        = "arn:aws:iam::123456789012:root"
      user_id    = "AROAEXAMPLE"
    }
  }
}

run "test_with_mocks" {
  command = plan
  assert {
    condition     = aws_instance.test.ami != ""
    error_message = "Instance should have AMI set"
  }
}

State Encryption

Terraform itself has no client-side state encryption; that feature shipped in OpenTofu 1.7. With Terraform, the backend protects the state. For any state containing sensitive values (passwords, API keys, certificates), enable encryption at rest on the backend:

# backend.tf
terraform {
  backend "s3" {
    bucket     = "my-terraform-state"
    key        = "env:/dev/terraform.tfstate"
    region     = "us-east-1"
    encrypt    = true   # Server-side encryption (AWS S3 SSE)
    # Optional SSE-KMS key; the ARN below is a placeholder
    kms_key_id = "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID"
  }
}

Backend encryption only protects state at rest in the bucket. Never commit state files to Git, and restrict who can read the state bucket.

Importing Existing Infrastructure

If infrastructure already exists outside of Terraform, use terraform import:

# Import an existing EC2 instance
terraform import aws_instance.existing i-1234567890abcdef0

# Then run terraform plan to see what Terraform thinks should happen
terraform plan

The import brings the resource into state but doesn’t generate configuration. You’ll need to write the corresponding .tf resource block manually.
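Since Terraform 1.5 there is an alternative: declare an import block and let plan draft the resource block for you. The sketch below assumes Terraform 1.5 or later and an initialized working directory. The function name generate_import_config and the file names import.tf and generated.tf are my own choices.

```shell
# generate_import_config: hypothetical helper. Writes an import block
# (Terraform 1.5+) and asks plan to draft the matching resource block.
generate_import_config() {
  addr="$1"   # resource address, e.g. aws_instance.existing
  id="$2"     # provider-specific ID, e.g. i-1234567890abcdef0
  cat > import.tf <<EOF
import {
  to = ${addr}
  id = "${id}"
}
EOF
  # generated.tf must not exist yet; Terraform writes the drafted config there
  terraform plan -generate-config-out=generated.tf
}
```

Treat the generated file as a starting point and review it before applying, since it often needs cleanup.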

What Changed Recently (2024-2026)

  • Terraform 1.7 (January 2024): mock_provider and override blocks for terraform test, config-driven removed blocks
  • Terraform 1.8 (April 2024): Provider-defined functions
  • Terraform 1.9 (June 2024): Input variable validation rules that can reference other objects
  • Terraform 1.10 (November 2024): Ephemeral values and resources that keep secrets out of state and plan files, S3-native state locking via use_lockfile
  • tfenv and tfswitch remain common tools for managing multiple Terraform versions on developer machines

Common Gotchas (2024-2026)

Editing state manually corrupts it. The .tfstate JSON is internal. Direct edits cause drift and corruption. Always use terraform state subcommands.

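terraform refresh modifies state. terraform plan already refreshes in memory before it computes changes. The standalone terraform refresh command writes whatever it finds straight into state with no chance to review it, and it is deprecated in favor of terraform apply -refresh-only, which shows the detected changes and asks for approval first.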

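State locking failures. If Terraform reports “Error acquiring the state lock”, another run usually holds the lock. Wait and retry. If the lock is stale because a run crashed, release it with terraform force-unlock and the lock ID from the error message, after confirming nothing else is running. Never disable locking to get around it: concurrent applies corrupt state.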

Sensitive values in TRACE logs. TF_LOG=TRACE can expose secret values in plain text. Do not rely on Terraform to redact them: treat log files as secrets, restrict access, and scrub them before sharing.

Provider vs. core logs differ. Set both TF_LOG_CORE and TF_LOG_PROVIDER separately for complete visibility when debugging complex issues.

terraform console needs initialized providers. Always run terraform init before terraform console. In CI scripts, don’t assume the working directory is initialized.

Remote state needs locking. Without state locking on S3, concurrent applies can corrupt state. Always configure dynamodb_table (or use_lockfile on Terraform 1.10+) for production backends.

My Workflow

When something breaks:

  1. Run terraform validate first for quick config checks
  2. Check terraform state list and terraform state show before enabling verbose logging
  3. Enable TF_LOG=ERROR to see if the issue is obvious
  4. Bump to TF_LOG=TRACE if not, then search the log for ERROR or WARN lines
  5. Use TF_LOG_PROVIDER=TRACE when the problem is in the cloud (not found, access denied, etc.)
  6. For module behavior, write a terraform test, which is faster than running apply in staging
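When I search a TRACE log, I start with a count per level to see how bad things are. This is a rough sketch, assuming the plain-text log format where each line carries a bracketed level such as [ERROR]. The function name count_log_levels is hypothetical.

```shell
# count_log_levels: hypothetical helper that counts log lines per level.
# Assumes the plain-text format, where each line carries a bracketed level
# such as [ERROR] or [WARN].
count_log_levels() {
  for level in ERROR WARN INFO DEBUG TRACE; do
    # grep -c prints the number of matching lines (0 if there are none)
    printf '%s %s\n' "$level" "$(grep -c "\[$level\]" "$1")"
  done
}
```

Run it as count_log_levels ./terraform.log, then follow up with grep -n '\[ERROR\]' ./terraform.log to jump to the first failure.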

Report bugs at Terraform’s GitHub repo with TF_LOG=TRACE output attached. Review the log and remove sensitive values before sharing it.

For more on Terraform, the posts on Terraform best practices and infrastructure as code cover the surrounding workflow and state management patterns. The Terraform and Ansible guide covers post-provisioning configuration, and Terraform modules covers organizing infrastructure code at scale.
