Terraform Part 11 — Workspaces and Environment Separation

Terraform Series (11/15)
  1. Terraform Part 1 — What Is Terraform
  2. Terraform Part 2 — Installation and First Deploy
  3. Terraform Part 3 — HCL Syntax
  4. Terraform Part 4 — Variables and Outputs
  5. Terraform Part 5 — Providers
  6. Terraform Part 6 — Resources and Dependencies
  7. Terraform Part 7 — Data Sources and Import
  8. Terraform Part 8 — State Management
  9. Terraform Part 9 — Modules
  10. Terraform Part 10 — Loops and Conditionals
  11. Terraform Part 11 — Workspaces and Environment Separation
  12. Terraform Part 12 — Kubernetes and Helm Providers
  13. Terraform Part 13 — CI/CD Integration
  14. Terraform Part 14 — Testing and Policy
  15. Terraform Part 15 — Practical Patterns and Pitfalls

Why Environment Separation Is Hard

“Take the infrastructure deployed in dev and create the same thing in prod.” Sounds simple, but in practice it’s tricky. Prod has different sizing, different backup policies, and different access permissions. It’s not about creating the exact same thing twice — it’s about reproducing the same pattern with slightly different parameters.

Separating environments also means fully separating state. A mistake in dev must not affect prod. If they share the same state, one side’s work can break the other.

flowchart TB
    subgraph "Environment Separation Requirements"
        direction TB
        Sep1["Complete state separation"]
        Sep2["Allow variable/config differences"]
        Sep3["Access control separation"]
        Sep4["Differentiated approval processes"]
        Sep5["Minimize code duplication"]
    end

Plain Terraform gives you two ways to do this: workspaces and directory-based separation. Terragrunt is a complementary tool that addresses the weaknesses of the directory approach. Let’s look at all three in order.

terraform workspace

By default, Terraform operates in a workspace called default. Each workspace has its own state, so switching workspaces switches which state Terraform reads and writes.

# Check current workspace
terraform workspace show
# default

# Create new workspaces
terraform workspace new dev
terraform workspace new prod

# Switch workspace
terraform workspace select dev

# List
terraform workspace list
#   default
# * dev
#   prod

When using a remote backend (like S3), state file paths are automatically separated per workspace. For example, if your backend configuration is:

terraform {
  backend "s3" {
    bucket = "my-tfstate"
    key    = "app/terraform.tfstate"
    region = "ap-northeast-2"
  }
}

The actual S3 paths look like this.

s3://my-tfstate/
├── app/terraform.tfstate                    # default
└── env:/
    ├── dev/app/terraform.tfstate            # dev workspace
    └── prod/app/terraform.tfstate           # prod workspace
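
The env: segment in that layout is the S3 backend's default workspace prefix. As a minimal sketch, assuming you would rather use a different prefix, the backend's workspace_key_prefix argument changes it. The value "workspaces" below is an arbitrary choice for illustration, not something from this series.

```hcl
terraform {
  backend "s3" {
    bucket = "my-tfstate"
    key    = "app/terraform.tfstate"
    region = "ap-northeast-2"

    # Replaces the default "env:" prefix used for non-default workspaces.
    # "workspaces" is an illustrative value.
    workspace_key_prefix = "workspaces"
  }
}
```

With this setting, the dev workspace state would land at workspaces/dev/app/terraform.tfstate, while the default workspace stays at app/terraform.tfstate.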

In code, you can reference the current workspace name via the terraform.workspace named value.

locals {
  environment = terraform.workspace

  instance_count = {
    dev     = 1
    staging = 2
    prod    = 5
  }[terraform.workspace]
}

resource "aws_instance" "app" {
  # ami and other required arguments are omitted for brevity
  count         = local.instance_count
  instance_type = local.environment == "prod" ? "m5.large" : "t3.micro"

  tags = {
    Environment = local.environment
  }
}

At first glance, it looks clean. One set of code, just switch workspaces to get multiple environments. But in practice, this can become a trap.
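
One trap is visible in the snippet itself: indexing the map with terraform.workspace fails with an invalid-index error in any workspace that has no entry, including default. A minimal sketch of a more defensive variant, assuming a fallback to the smallest size is acceptable (the fallback value and the local name instance_counts are illustrative), uses lookup with a default:

```hcl
locals {
  environment = terraform.workspace

  # Per-workspace sizing; keys must match workspace names exactly.
  instance_counts = {
    dev     = 1
    staging = 2
    prod    = 5
  }

  # lookup() returns its third argument when the key is missing,
  # so an unlisted workspace (e.g. "default") gets 1 instead of an error.
  instance_count = lookup(local.instance_counts, local.environment, 1)
}
```

Whether a silent fallback or a hard failure is safer depends on the team. For anything prod-like, failing fast is arguably the better default.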

Limitations of Workspaces

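Even the official HashiCorp documentation is cautious here: it points out that CLI workspaces share a single backend and are therefore not a suitable isolation mechanism when deployments need separate credentials and access controls. In other words, they are a poor fit for separating production. Here’s why.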

1) Shared code means shared risk

Suppose you make a bad code change while working in the dev workspace. That same code is what prod runs. There is no promotion step where a change is verified in dev and then moved to prod; every workspace runs the same configuration, so it is hard to add verification steps specific to production.

2) Configuration files aren’t separated

Workspaces separate state, but share code. You use terraform.workspace to branch environment-specific settings, but as branches multiply, code becomes complex and readability drops.

resource "aws_db_instance" "db" {
  instance_class = {
    dev     = "db.t3.micro"
    staging = "db.t3.medium"
    prod    = "db.r5.xlarge"
  }[terraform.workspace]

  allocated_storage = {
    dev     = 20
    staging = 50
    prod    = 500
  }[terraform.workspace]

  backup_retention_period = {
    dev     = 1
    staging = 7
    prod    = 30
  }[terraform.workspace]

  # ... workspace branching for dozens of attributes
}

Even this much is already unpleasant to read. You could extract to variable files, but then it’s not much different from directory separation.
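
For comparison, extracting the values into variable files could look like the sketch below. The variable names and the file name prod.tfvars are assumptions for illustration; the values are taken from the example above.

```hcl
# variables.tf
variable "db_instance_class" {
  type = string
}

variable "db_allocated_storage" {
  type = number
}

variable "db_backup_retention_period" {
  type = number
}

# main.tf (other required arguments omitted, as in the example above)
resource "aws_db_instance" "db" {
  instance_class          = var.db_instance_class
  allocated_storage       = var.db_allocated_storage
  backup_retention_period = var.db_backup_retention_period
}

# prod.tfvars would be a separate file containing:
# db_instance_class          = "db.r5.xlarge"
# db_allocated_storage       = 500
# db_backup_retention_period = 30
```

You would then have to pair each workspace with its file by hand (terraform workspace select prod, followed by terraform apply -var-file=prod.tfvars), which is the kind of bookkeeping a directory per environment makes unnecessary.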

3) The backend itself can’t be separated

Workspaces only differ in key paths within the same backend. If you want to put the dev state in a different AWS account’s S3 bucket, workspaces can’t do that. For fully separated backends per environment, you need a different approach.

4) Error-prone

The shell prompt doesn’t show the current workspace by default. What happens if you run terraform apply believing you are in dev while prod is actually selected? Without automation, this mistake is easy to make.
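
One way to reduce this risk, sketched here under the assumption of Terraform 1.4 or later (for the built-in terraform_data resource), is to make the configuration refuse to run unless the caller names the intended workspace explicitly. The variable name expected_workspace and the resource name workspace_guard are hypothetical.

```hcl
variable "expected_workspace" {
  type        = string
  description = "Workspace the caller intends to modify; must match the selected one."
}

# A resource with no real-world side effects that exists only to carry the check.
resource "terraform_data" "workspace_guard" {
  lifecycle {
    precondition {
      condition     = terraform.workspace == var.expected_workspace
      error_message = "Selected workspace does not match var.expected_workspace."
    }
  }
}
```

Running terraform apply -var="expected_workspace=dev" while prod is selected would then fail during planning instead of touching prod resources.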

Summary: Workspaces are suitable for short-lived experimental sandboxes within the same team or per-feature preview environments. They’re unsuitable for long-term dev/staging/prod separation.

Directory-Based Environment Separation

This is the far more common approach in practice. Environments are completely separated by directory.

infra/
├── modules/               # Reusable modules
│   ├── vpc/
│   ├── eks/
│   └── rds/
└── envs/
    ├── dev/
    │   ├── main.tf        # Module composition
    │   ├── variables.tf
    │   ├── terraform.tfvars
    │   └── backend.tf     # Dev state backend
    ├── staging/
    │   ├── main.tf
    │   ├── variables.tf
    │   ├── terraform.tfvars
    │   └── backend.tf     # Staging state backend
    └── prod/
        ├── main.tf
        ├── variables.tf
        ├── terraform.tfvars
        └── backend.tf     # Prod state backend

Each environment is an independent directory with an independent backend. Modules are shared, but per-environment settings live within each directory.
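
As a sketch of what that independence looks like, the backend.tf files for dev and prod might differ as follows. The bucket names and the key are hypothetical; in a multi-account setup each bucket would live in that environment's own AWS account.

```hcl
# envs/dev/backend.tf
terraform {
  backend "s3" {
    bucket = "example-dev-tfstate" # hypothetical bucket in the dev account
    key    = "infra/terraform.tfstate"
    region = "ap-northeast-2"
  }
}
```

```hcl
# envs/prod/backend.tf
terraform {
  backend "s3" {
    bucket = "example-prod-tfstate" # hypothetical bucket in the prod account
    key    = "infra/terraform.tfstate"
    region = "ap-northeast-2"
  }
}
```

Because the two files never share a bucket, credentials that can write dev state need no access to prod state at all.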

flowchart TB
    Modules["modules/\n(Reusable units)"]

    Dev["envs/dev/"]
    Staging["envs/staging/"]
    Prod["envs/prod/"]

    Modules --> Dev
    Modules --> Staging
    Modules --> Prod

    Dev -->|"terraform init/apply\n(from dev directory)"| DevState["dev state\n(dev S3 bucket)"]
    Staging -->|"terraform init/apply\n(from staging directory)"| StagingState["staging state\n(staging S3 bucket)"]
    Prod -->|"terraform init/apply\n(from prod directory)"| ProdState["prod state\n(prod S3 bucket)"]

Each environment’s main.tf composes modules like this.

# envs/prod/main.tf
module "vpc" {
  source = "../../modules/vpc"

  cidr_block  = "10.0.0.0/16"
  environment = "prod"
  azs         = ["ap-northeast-2a", "ap-northeast-2c", "ap-northeast-2d"]
}

module "eks" {
  source = "../../modules/eks"

  cluster_name = "prod-eks"
  vpc_id       = module.vpc.vpc_id
  subnet_ids   = module.vpc.private_subnet_ids

  node_groups = {
    default = {
      instance_type = "m5.large"
      min_size      = 3
      max_size      = 10
      desired_size  = 5
    }
  }
}

# envs/dev/main.tf
module "vpc" {
  source = "../../modules/vpc"

  cidr_block  = "10.10.0.0/16"  # Different CIDR from prod
  environment = "dev"
  azs         = ["ap-northeast-2a", "ap-northeast-2c"]
}

module "eks" {
  source = "../../modules/eks"

  cluster_name = "dev-eks"
  vpc_id       = module.vpc.vpc_id
  subnet_ids   = module.vpc.private_subnet_ids

  node_groups = {
    default = {
      instance_type = "t3.medium"  # Smaller
      min_size      = 1
      max_size      = 3
      desired_size  = 1
    }
  }
}

Same modules, but different parameters per environment. Work is done from within each directory.

cd envs/dev
terraform init
terraform apply

# Switch to prod
cd ../prod
terraform init
terraform apply

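Advantages

Each environment has a fully independent backend, so a mistake in dev cannot touch prod state, and keeping each environment's state in a separate AWS account is possible. Differences between environments show up as plain values in each directory instead of workspace branches, and the learning curve is low because it is ordinary Terraform.

Disadvantages

Code is duplicated in proportion to the number of environments, the backend configuration is repeated in every directory, and nothing automates dependencies between separately applied configurations.
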
Most production teams use this approach. The duplication issue is partially mitigated by modularization, but not perfectly. Terragrunt fills this gap.

A Brief Introduction to Terragrunt

Terragrunt is a Terraform wrapper made by Gruntwork. Its core philosophy is “use Terraform the DRY (Don’t Repeat Yourself) way.”

Terragrunt’s key idea is to declare repeated elements across environment directories (backend settings, providers, common variables) once and inherit them.

Here’s a typical Terragrunt project structure.

infra/
├── terragrunt.hcl              # Root config (global backend, provider)
├── _envcommon/                 # Common component definitions
│   ├── vpc.hcl
│   └── eks.hcl
└── envs/
    ├── dev/
    │   ├── env.hcl             # Dev global variables
    │   ├── vpc/
    │   │   └── terragrunt.hcl  # Inherits _envcommon/vpc.hcl
    │   └── eks/
    │       └── terragrunt.hcl
    └── prod/
        ├── env.hcl
        ├── vpc/
        │   └── terragrunt.hcl
        └── eks/
            └── terragrunt.hcl

Declare the backend once in the root terragrunt.hcl

# infra/terragrunt.hcl
remote_state {
  backend = "s3"

  config = {
    bucket         = "my-company-tfstate"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-northeast-2"
    encrypt        = true
    dynamodb_table = "my-company-tflock"
  }

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
}

The important part is path_relative_to_include(). It returns the path of each child directory relative to the root terragrunt.hcl, and that path becomes the state key. Running from envs/dev/vpc sets the key to envs/dev/vpc/terraform.tfstate. Per-environment, per-component state separation comes for free.
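
To make that concrete, the backend.tf that Terragrunt generates inside envs/dev/vpc should look roughly like the sketch below. The header comment and exact formatting of the generated file vary by Terragrunt version, so treat this as an approximation.

```hcl
# backend.tf (generated by Terragrunt; not meant to be edited by hand)
terraform {
  backend "s3" {
    bucket         = "my-company-tfstate"
    key            = "envs/dev/vpc/terraform.tfstate"
    region         = "ap-northeast-2"
    encrypt        = true
    dynamodb_table = "my-company-tflock"
  }
}
```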

Environment settings declared once

# envs/dev/env.hcl
locals {
  environment = "dev"
  aws_region  = "ap-northeast-2"
  account_id  = "111122223333"
}

Actual component declaration

# envs/dev/vpc/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

include "envcommon" {
  path = "${get_terragrunt_dir()}/../../../_envcommon/vpc.hcl"
}

locals {
  env = read_terragrunt_config(find_in_parent_folders("env.hcl")).locals
}

terraform {
  source = "../../../modules/vpc"
}

inputs = {
  cidr_block  = "10.10.0.0/16"
  environment = local.env.environment
}

The production VPC’s terragrunt.hcl is nearly identical, with only the CIDR in inputs being different.
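
For illustration, here is a sketch of envs/prod/vpc/terragrunt.hcl. It assumes the prod CIDR from the directory-based example earlier and an envs/prod/env.hcl that sets environment to "prod".

```hcl
# envs/prod/vpc/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

include "envcommon" {
  path = "${get_terragrunt_dir()}/../../../_envcommon/vpc.hcl"
}

locals {
  # Resolves to envs/prod/env.hcl because the search walks up from this directory.
  env = read_terragrunt_config(find_in_parent_folders("env.hcl")).locals
}

terraform {
  source = "../../../modules/vpc"
}

inputs = {
  cidr_block  = "10.0.0.0/16" # the only value that differs from dev
  environment = local.env.environment
}
```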

Another powerful feature Terragrunt provides is automatic dependency management.

# envs/dev/eks/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}

The dependency block automatically retrieves outputs from another Terragrunt project. No need to manually copy VPC outputs or fetch them via data sources.
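
For contrast, here is a sketch of the same wiring in plain Terraform using the terraform_remote_state data source. The bucket and key follow the Terragrunt backend example above; this shows what the dependency block replaces, not something Terragrunt generates.

```hcl
# Plain Terraform: read the VPC state directly to get its outputs.
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config = {
    bucket = "my-company-tfstate"
    key    = "envs/dev/vpc/terraform.tfstate"
    region = "ap-northeast-2"
  }
}

module "eks" {
  source = "../../modules/eks"

  # node_groups and other inputs omitted for brevity
  cluster_name = "dev-eks"
  vpc_id       = data.terraform_remote_state.vpc.outputs.vpc_id
  subnet_ids   = data.terraform_remote_state.vpc.outputs.private_subnet_ids
}
```

Every consumer has to repeat the bucket, key, and region of the producer's state, which is the duplication the dependency block removes.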

The run-all command

# Apply all sub-projects in dependency order
terragrunt run-all apply

# A specific environment only
cd envs/dev
terragrunt run-all apply

Terragrunt calculates the DAG and auto-applies in VPC → EKS → app order.

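When to adopt Terragrunt

Terragrunt has a steeper learning curve than plain directories, so it tends to pay off as the number of environments, components, or AWS accounts grows and the repeated backend and provider configuration starts to hurt. For a handful of environments, directory-based separation with shared modules is often enough.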

Strategy Comparison

Here’s a summary of the three strategies.

| Factor | workspace | Directory Separation | Terragrunt |
| --- | --- | --- | --- |
| State separation | Path separation within same backend | Fully independent backends | Fully independent backends |
| Multi-AWS account | Difficult | Possible | Possible, easy |
| Code duplication | Low | High (proportional to environments) | Low |
| Backend config | Once | Repeated per environment | Once |
| Learning curve | Low | Low | Medium |
| Dependency automation | None | None | Built-in |
| Large-scale suitability | Low | Medium | High |

Practical recommendations

Whichever approach you choose, prod state must always be in a completely separate backend. Separating prod with workspaces is risky in the long term.

Combining with CI/CD

Environment separation is tightly coupled with CI/CD. A typical pipeline looks like this.

flowchart LR
    PR["PR Created"] --> Plan["terraform plan\n(all environments)"]
    Plan --> Review["Review & Approval"]
    Review --> Merge["Merge to main"]
    Merge --> Dev["dev auto-apply"]
    Dev --> Stg["staging auto-apply"]
    Stg --> Manual["prod manual approval"]
    Manual --> Prod["prod apply"]

At PR time, plan results for all environments are posted as comments. After merge, dev and staging auto-apply, while prod goes through a manual approval step. With directory-based separation, you run Terraform individually from each environment directory; with Terragrunt, run-all handles it all at once.

This pipeline is covered in detail in Part 13 (CI/CD Integration).


Environment separation is a balancing act between two goals that pull in opposite directions: minimizing code duplication and keeping state independent. Choose the right point based on team size and environment complexity. What’s certain is that prod must always have independent state.

In the next part, we’ll cover Kubernetes and Helm providers. We’ll also look at the criteria for deciding whether to manage cluster internals with Terraform or delegate to ArgoCD.

Part 12: Kubernetes and Helm Providers

