Table of contents
- The World Already Exists Outside Terraform
- data Blocks — Read-Only Resources
- terraform import — Incorporating Existing Resources
- moved — Changing Resource Addresses During Refactoring
- Practical Flow for Moving Console Resources to Terraform
- When to Use Querying vs Incorporation
- Moving Past the Basics
The World Already Exists Outside Terraform
The series so far has flowed from the premise of “building infrastructure from scratch with Terraform on a blank cloud.” But reality is different. When you join a company, there’s already a VPC someone created via the console, an S3 bucket someone manually set up that’s been running for three years, and the networking lives in a separate Terraform project managed by another team. Even when building a new system, it needs to connect to those existing assets.
This part covers two tools.
- `data` blocks — Read existing resources for reference. Leave management as-is and just pull values
- `terraform import` — Bring existing resources under Terraform management. From then on, Terraform takes the lead
The difference lies in “ownership.” A data source says “I didn’t create it and I don’t need to delete it, just tell me the values,” while import says “I’m going to manage it from now on.”
flowchart LR
subgraph EXIST[Already Existing Resources]
R[VPC created via console<br/>Managed by another team]
end
subgraph ME[My Terraform]
D[data block<br/>Read only]
I[terraform import<br/>Bring under management]
end
R -->|Query| D
R -->|Incorporate| I
D -->|Use attributes| USE1[My resources]
I -->|Managed by Terraform going forward| TF[Recorded in tfstate]
data Blocks — Read-Only Resources
data looks similar to resource on the surface. It takes two labels (type and name) and receives conditions as arguments. The difference is that Terraform never creates, modifies, or deletes anything through it. During terraform plan and apply, it simply calls the AWS API to find a resource matching the given conditions and retrieves its attributes.
AMI Lookup — The Most Common Pattern
This is a pattern we briefly saw in Part 2.
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["al2023-ami-*-x86_64"]
}
}
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = "t3.micro"
}
Instead of hardcoding an AMI ID like "ami-0c55b159cbfafe1f0", this finds "the latest al2023 published by Amazon" at runtime, so when a new AMI is released, the next plan picks it up automatically. The reference syntax is the same as for resources, but note the data. prefix.
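Anything under data. can be referenced just like a resource attribute. For instance, a small optional addition that surfaces which AMI the lookup actually resolved:

```hcl
# Expose the resolved AMI for quick inspection after plan/apply
output "resolved_ami" {
  value = {
    id   = data.aws_ami.amazon_linux.id
    name = data.aws_ami.amazon_linux.name
  }
}
```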
Default VPC/Subnet Lookup
Each region has a default VPC that AWS creates automatically. For simple experiments, you might want to use it as-is.
data "aws_vpc" "default" {
default = true
}
data "aws_subnets" "default" {
filter {
name = "vpc-id"
values = [data.aws_vpc.default.id]
}
}
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = "t3.micro"
subnet_id = tolist(data.aws_subnets.default.ids)[0]
}
aws_subnets (plural) returns its ids attribute as a set, so tolist(...)[0] converts it to a list and takes the first subnet. This way you can place resources on top of the default network.
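If you want one instance per default subnet rather than only the first, the same data source can drive for_each. A minimal sketch (iteration patterns get full treatment later in the series):

```hcl
# One instance per default subnet; each.value is a subnet ID
resource "aws_instance" "web_per_subnet" {
  for_each      = toset(data.aws_subnets.default.ids)
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  subnet_id     = each.value
}
```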
Referencing Resources Managed by Other Teams
This is a more important use case in practice. Suppose the network team follows a structure where “we manage VPCs with Terraform, and each team deploys services on top of them.” The service team needs the VPC ID but doesn’t manage it.
There are several approaches.
# 1. Look up by tags
data "aws_vpc" "platform" {
tags = {
Name = "platform-main"
Env = "prod"
}
}
# 2. Receive the ID as a variable
variable "vpc_id" {
type = string
}
data "aws_vpc" "platform" {
id = var.vpc_id
}
# 3. Read outputs from another Terraform State (remote state)
data "terraform_remote_state" "network" {
backend = "s3"
config = {
bucket = "myorg-terraform-state"
key = "network/prod/terraform.tfstate"
region = "ap-northeast-2"
}
}
# Usage
resource "aws_subnet" "my_service" {
vpc_id = data.terraform_remote_state.network.outputs.vpc_id
cidr_block = "10.0.100.0/24"
}
Approach 3 (terraform_remote_state) reads values that the network team exposed as outputs directly from our project. It’s clean but creates tight coupling (if the output structure of the remote state changes, our code breaks too). As organizations grow, it’s better for maintainability to either create a dedicated data source for network information lookup (e.g., store values in aws_ssm_parameter) or use tag-based lookups for loose coupling.
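As a sketch of that SSM idea (the parameter name and resources here are hypothetical): the owning team publishes the value once, and consumers read it without depending on the remote State's output structure.

```hcl
# Network team side: publish the VPC ID as a plain parameter
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/platform/prod/vpc-id"
  type  = "String"
  value = aws_vpc.main.id
}

# Service team side: read the parameter, no remote state coupling
data "aws_ssm_parameter" "vpc_id" {
  name = "/platform/prod/vpc-id"
}

resource "aws_subnet" "my_service" {
  vpc_id     = data.aws_ssm_parameter.vpc_id.value
  cidr_block = "10.0.100.0/24"
}
```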
Account Information, Region, Availability Zones
Frequently used metadata can also be queried via data sources.
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}
data "aws_availability_zones" "available" {
state = "available"
}
locals {
account_id = data.aws_caller_identity.current.account_id
region = data.aws_region.current.name
azs = data.aws_availability_zones.available.names
}
resource "aws_s3_bucket" "logs" {
# A unique name that includes account ID and region
bucket = "myapp-logs-${local.account_id}-${local.region}"
}
This prevents the mistake of hardcoding account IDs and makes it easier to reuse the same code across multiple regions.
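The azs list pairs naturally with subnet placement. A minimal sketch reusing the default VPC from earlier (the CIDR offsets are placeholders chosen to dodge existing subnets):

```hcl
# Two subnets spread across the first two available AZs
resource "aws_subnet" "extra" {
  count             = 2
  vpc_id            = data.aws_vpc.default.id
  cidr_block        = cidrsubnet(data.aws_vpc.default.cidr_block, 8, count.index + 100)
  availability_zone = local.azs[count.index]
}
```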
Summary of Differences Between data and resource
flowchart TB
subgraph RES[resource]
RC[Create]
RU[Update]
RD[Delete]
RS[Read]
end
subgraph DATA[data]
DS2[Read only]
end
APPLY[terraform apply] --> RC
APPLY --> RU
APPLY --> RD
APPLY --> RS
APPLY --> DS2
DESTROY[terraform destroy] -.->|Affected| RD
DESTROY -.->|Not affected| DS2
Data blocks are re-read from the actual cloud on every plan and apply; results are not cached between runs. This behavior means network calls can add up, so if dozens of data blocks redundantly query the same resource, consider refactoring to look it up once and fan the result out through locals.
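A minimal example of that refactor: one lookup, fanned out through locals, instead of the same data block repeated across files.

```hcl
# Single query for the shared VPC; everything else references the locals
data "aws_vpc" "platform" {
  tags = { Name = "platform-main" }
}

locals {
  platform_vpc_id   = data.aws_vpc.platform.id
  platform_vpc_cidr = data.aws_vpc.platform.cidr_block
}
```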
terraform import — Incorporating Existing Resources
Now let’s go in the opposite direction. Bringing “resources Terraform doesn’t know about” — like S3 buckets or RDS instances created via the console — under Terraform management.
The purpose of import is clear. Register existing resources in the State without recreating them, so Terraform manages them going forward. Deleting and recreating a production DB would be disastrous, so this approach is essential.
The Old Way (CLI Command)
Up through Terraform 1.4, the process used CLI commands. This classic approach still works today.
# 1. Declare an empty resource in .tf file
cat > imported.tf <<EOF
resource "aws_s3_bucket" "legacy_logs" {
bucket = "legacy-logs-bucket"
}
EOF
# 2. Import into State
terraform import aws_s3_bucket.legacy_logs legacy-logs-bucket
# 3. Check diff between current configuration and actual state with plan
terraform plan
The import command format is terraform import <resource_address> <resource_ID>. The resource ID format varies by resource type (EC2 is i-xxx, S3 is the bucket name, RDS is the identifier). Check the “Import” section at the bottom of each resource’s official documentation.
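A few common ID shapes for illustration (all identifiers below are placeholders):

```bash
terraform import aws_instance.web i-0abc123def4567890    # EC2: instance ID
terraform import aws_s3_bucket.logs my-logs-bucket       # S3: bucket name
terraform import aws_db_instance.main my-db-identifier   # RDS: DB identifier
```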
The problem is that this approach only modifies the State without touching the code. After importing, running terraform plan produces a massive diff saying “my .tf file only has the bucket name, but the actual bucket has versioning, encryption, CORS settings, and everything else.” Transferring all of this to code one by one was incredibly tedious.
Declarative import Blocks (Terraform 1.5+)
So, starting with Terraform 1.5, import blocks were added. In this declarative style, you state "import this resource to this address" directly in your code.
# imports.tf
import {
to = aws_s3_bucket.legacy_logs
id = "legacy-logs-bucket"
}
# Resource block is still required
resource "aws_s3_bucket" "legacy_logs" {
bucket = "legacy-logs-bucket"
# ... remaining configuration
}
Running terraform plan in this state shows, in a single pass, that the resource will be imported along with any diff between the current .tf and the actual state. Running apply then performs the import automatically.
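The plan summary calls the import out explicitly, roughly like this (exact wording varies by Terraform version):

```
aws_s3_bucket.legacy_logs: Preparing import... [id=legacy-logs-bucket]

Plan: 1 to import, 0 to add, 0 to change, 0 to destroy.
```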
Even better is the code generation option.
terraform plan -generate-config-out=imported.tf
Terraform inspects the actual resource’s attributes and auto-generates .tf code. You read the generated code and keep only the necessary parts while cleaning up. The tedious manual transcription is greatly reduced.
flowchart LR
subgraph OLD[Old Approach]
C1[Write empty resource] --> C2[terraform import]
C2 --> C3[Check diff with plan]
C3 --> C4[Modify code to match reality]
end
subgraph NEW[1.5+ import Block]
N1[Write import block] --> N2["terraform plan<br/>-generate-config-out=..."]
N2 --> N3[Clean up auto-generated<br/>code]
N3 --> N4[terraform apply]
end
For new projects, use the 1.5+ approach. The legacy import documentation still works, but declarative import is much more Terraform-native.
Things to Watch Out for When Importing
Since import only fills in the State, there are a few pitfalls.
- Carefully review the diff between code and reality. If a plan shows dozens of attribute changes at once, it's hard to predict what will be recreated. If you see `-/+`, that means an unintended recreation. Fix the code to match reality first, then apply
- Tags often mismatch. If you've set `default_tags` on the Provider, existing resources won't have those, so everything shows up as "add tag" changes. These are harmless but make the plan messy
- Set `prevent_destroy` on sensitive resources first. If you accidentally run destroy right after import, it's catastrophic. When importing a production DB, start by adding `lifecycle { prevent_destroy = true }` to the resource
- For resources inside modules, get the address right. Use the format `module.foo.aws_instance.bar`, as sketched below
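For the module case, a hypothetical sketch (the module and resource names are made up):

```hcl
# The import target address must include the full module path
import {
  to = module.storage.aws_s3_bucket.this
  id = "legacy-logs-bucket"
}
```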
moved — Changing Resource Addresses During Refactoring
Another tool closely related to data sources and import is the moved block, added in Terraform 1.1. Use it when you want to change only a resource's internal address without recreating the resource.
This is a common situation in practice — moving a resource into a module during refactoring.
# Old structure
resource "aws_s3_bucket" "logs" {
bucket = "myapp-logs"
}
# New structure — moved to a module
module "logs" {
source = "./modules/bucket"
name = "myapp-logs"
}
If you make this change, Terraform will try to delete aws_s3_bucket.logs and create module.logs.aws_s3_bucket.this as new. The actual bucket gets deleted and recreated. In a production environment, this is an unacceptable change.
The solution is the moved block.
moved {
from = aws_s3_bucket.logs
to = module.logs.aws_s3_bucket.this
}
With this declaration, Terraform recognizes it as “just changing the address in the State internally.” It doesn’t touch the actual resource. Refactoring becomes safe.
Previously, this was done with CLI commands like terraform state mv, which was imperative and non-reproducible. moved blocks are declarative and tracked in Git, so team members can see the history.
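The same mechanism covers plain renames, not just moves into modules:

```hcl
# Rename the resource address; the bucket itself is untouched
moved {
  from = aws_s3_bucket.logs
  to   = aws_s3_bucket.app_logs
}
```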
Practical Flow for Moving Console Resources to Terraform
Combining the tools from this part, let’s say you get a task at work: “We have an RDS created via the console, and now we want to manage it with Terraform.” Here’s what the practical steps look like.
sequenceDiagram
participant Dev as Developer
participant TF as Terraform
participant AWS as AWS
Dev->>AWS: Investigate current RDS configuration (console, CLI)
Dev->>TF: Write import block + resource
Note over TF: lifecycle { prevent_destroy = true }
Dev->>TF: terraform plan -generate-config-out
TF-->>Dev: Auto-generated code
Dev->>TF: Clean up & review code
Dev->>TF: terraform plan (final)
TF-->>Dev: No changes / only safe changes
Dev->>TF: terraform apply
TF->>AWS: Register RDS in State
Note over Dev,AWS: Terraform manages it from here on
In words, the steps are as follows.
- Investigate: Identify the actual RDS engine, version, storage, VPC, security groups, and parameter groups
- Write: Create a corresponding `aws_db_instance` in `.tf`, add an `import` block, and include `prevent_destroy` up front
- Auto-generate: Use `terraform plan -generate-config-out=...` to generate code from the current state
- Clean up: Remove attributes from the generated code that Terraform can't manage (like `status`, which AWS determines), and add attributes you don't want to manage to `ignore_changes` (see the sketch after this list)
- Verify: Confirm that `terraform plan` shows "No changes" or only acceptable changes
- Apply: From this point on, Terraform takes ownership
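A hypothetical shape of the cleaned-up result (the identifier and attribute values are placeholders):

```hcl
resource "aws_db_instance" "main" {
  identifier     = "legacy-db"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.t3.medium"
  # ... remaining attributes carried over from the generated code

  lifecycle {
    prevent_destroy = true
    # e.g. credentials are rotated outside Terraform
    ignore_changes = [password]
  }
}
```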
It looks complex on the surface, but the 1.5+ approach is far less painful than before. If your company has a lot of console legacy, this process is worth doing as a team effort.
When to Use Querying vs Incorporation
Let’s wrap up with the selection criteria.
- You just need values, and someone else manages it → `data` block
- You'll manage it from now on → `terraform import` (1.5+ declarative recommended)
- Only the resource address changed, the actual resource is the same → `moved` block
When these three tools are combined, you can gradually gain Terraform’s benefits even for infrastructure that wasn’t created from scratch. This is especially valuable in environments where legacy and new code coexist.
Moving Past the Basics
Over seven parts, we’ve covered Terraform’s fundamentals. From IaC concepts through installation, HCL syntax, variables, providers, resources, and connecting with existing infrastructure. If you’ve made it this far, reading, modifying, and adding features to typical AWS/GCP infrastructure project code shouldn’t be difficult.
Starting from Part 8, we dive into practical operational topics. We’ll cover State management, modules, iteration patterns, environment separation, CI/CD integration, and testing and policies — tackling Terraform at scale.
In the next part, we’ll cover how to store Terraform State remotely and share it safely across a team.

