Terraform Data Sources – What They Are and How to Use Them

Written by Bits Lovers

Terraform manages cloud infrastructure as code. You describe what you want, and it figures out how to make it happen. Like any programming language, Terraform has features that aren’t obvious at first glance. One of those is data sources, and they’re worth understanding because they come up constantly in real projects. If you’re looking for the newer built-in helper resource rather than data sources themselves, the 2026 terraform_data vs null_resource guide is the right follow-up.

What is a Data Source in Terraform?

A data source lets Terraform fetch information from somewhere outside your current configuration. That “somewhere” could be your cloud provider’s API, another Terraform workspace’s state file, or even a local file.

For example, if you need the list of availability zones available in your AWS account, you could hardcode them. Or you could use a data source to query AWS directly and get the current list. The second approach is better because zones get added and removed over time.

When Should You Use Data Sources?

The main reason to reach for a data source is when you find yourself hardcoding values that might change. Availability zones, AMI IDs, VPC IDs from another project – these are all good candidates.

Data sources also help when you’re splitting infrastructure across multiple Terraform workspaces or Terraform Cloud projects. Instead of copying output values around manually, you can reference them directly.

How to Use a Terraform Data Source

Let’s walk through a real example. Say you have a module that creates subnets, and you need to specify the availability zones for them.

The subnet module has an azs argument that takes a list of zones. A common mistake is hardcoding those zones:

module "subnet" {
  source = "./modules/subnet" # example module path
  azs    = ["us-west-1b", "us-west-1c"]
}

This works fine until someone changes the region variable. The zones are still hardcoded to us-west-1, but the region might now be us-east-1. That’s a mismatch that will fail at apply time.

Instead, use the aws_availability_zones data source:

data "aws_availability_zones" "available" {
  state = "available"
}

The state = "available" argument filters the results to only zones that are currently available. This data source is part of the AWS provider, and you can find its full documentation on the Terraform Registry.

Now reference it in your module:

module "subnet" {
  source = "./modules/subnet" # example module path
  azs    = data.aws_availability_zones.available.names
}

The data source returns a names attribute containing the list of zone names. No more hardcoded values.
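If you want to see exactly what the data source returned, you can expose it as a root-module output (the output name here is just an example):

```hcl
# Expose the zone names retrieved by the data source so they can be
# inspected with `terraform output` after an apply.
output "availability_zone_names" {
  value = data.aws_availability_zones.available.names
}
```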

Note: If your AWS account has Local Zones enabled, the aws_availability_zones data source will include those by default. To filter them out, use the filter argument:

data "aws_availability_zones" "available" {
  state = "available"
  filter {
    name   = "opt-in-status"
    values = ["opt-in-not-required"]
  }
}

Using terraform_remote_state

The terraform_remote_state data source reads output values from another Terraform state file. This is how you share data between separate projects or workspaces.

Here’s an example using a local state file:

data "terraform_remote_state" "bitslovers_vpc" {
  backend = "local"
  config = {
    path = "../terraform.tfstate"
  }
}

For real projects, you’ll typically store state in S3. This is the standard approach for teams with multiple DevOps engineers working on the same infrastructure:

data "terraform_remote_state" "bitslovers_vpc" {
  backend = "s3"
  config = {
    bucket = "bitslovers-remotestate"
    key    = "blog/us_east_1/dev/terraform.tfstate"
    region = "us-east-1"
  }
}

To use the outputs from that remote state:

resource "aws_instance" "bitslovers" {
  subnet_id = data.terraform_remote_state.bitslovers_vpc.outputs.subnet_id
}

Notice the .outputs. in the reference. In Terraform 0.12 and later, you access remote state outputs through data.terraform_remote_state.<name>.outputs.<attribute>. The older interpolation syntax (${data.terraform_remote_state...}) still works but the direct reference is cleaner and preferred.
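For this reference to work, the project that owns the VPC has to declare the output in the first place – only outputs defined in that project's root module are visible through terraform_remote_state. A minimal sketch of the producing side (the subnet resource name is assumed):

```hcl
# In the VPC project: declare the value you want other projects to read.
# Only root-module outputs are accessible via terraform_remote_state.
output "subnet_id" {
  value = aws_subnet.bitslovers.id # assumed resource name
}
```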

Data Source Read Timing

Data sources are read during the plan phase, not during a separate refresh step. This was a change introduced back in Terraform 0.13, and it’s important to understand.

If a data source’s arguments don’t depend on any computed values (values that won’t be known until apply time), Terraform reads it during plan. The retrieved values appear in the plan output so you can see exactly what will happen.

If a data source’s arguments do reference computed values – say, an attribute of a resource that hasn’t been created yet – then Terraform can’t read it until the apply phase. In that case, the plan will show the data source attributes as (known after apply).
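As an illustration of the second case, the hypothetical lookup below filters subnets by the ID of a VPC created in the same configuration. Because aws_vpc.example.id is not known until apply, Terraform defers the read, and the plan shows the data source's attributes as (known after apply):

```hcl
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

# The filter references a computed value (the new VPC's ID), so this
# data source cannot be read until the apply phase.
data "aws_subnets" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [aws_vpc.example.id]
  }
}
```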

Using depends_on with Data Sources

Data sources support the depends_on meta-argument, just like managed resources:

data "aws_availability_zones" "available" {
  depends_on = [aws_subnet.example]
  state      = "available"
}

Adding depends_on delays the data source read until all listed dependencies have been applied. This is useful when the data you’re fetching depends on something being created first.

One thing to be aware of: if a data source argument directly references a managed resource, Terraform treats that as an implicit dependency and defers reading the data source until that resource has been created. This dependency follows the value even through locals, so if you want the data source to be read at plan time, make sure its arguments reference only values known during planning – variables, literals, or other data sources – rather than resource attributes.

Data Sources vs Variables vs Locals

When you’re new to Terraform, it’s easy to confuse data sources with variables and locals. Here’s how I keep them straight:

  • Data sources fetch information from external systems via providers (like querying AWS for availability zones).
  • Variables are inputs passed into your module from the caller.
  • Locals are computed values within your module – you derive them from variables, data sources, or other expressions.
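The distinctions above can be sketched in one small configuration (the names are illustrative):

```hcl
# Variable: an input supplied by the caller of this module.
variable "az_count" {
  type    = number
  default = 2
}

# Data source: information fetched from AWS via the provider.
data "aws_availability_zones" "available" {
  state = "available"
}

# Local: a value derived inside the module from the data source
# and the variable.
locals {
  azs = slice(data.aws_availability_zones.available.names, 0, var.az_count)
}
```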

Bits Lovers

Professional writer and blogger. Focus on Cloud Computing.
