
Terraform Part 12 — Kubernetes and Helm Providers

5 min read
Terraform Series (12/15)
  1. Terraform Part 1 — What Is Terraform
  2. Terraform Part 2 — Installation and First Deploy
  3. Terraform Part 3 — HCL Syntax
  4. Terraform Part 4 — Variables and Outputs
  5. Terraform Part 5 — Providers
  6. Terraform Part 6 — Resources and Dependencies
  7. Terraform Part 7 — Data Sources and Import
  8. Terraform Part 8 — State Management
  9. Terraform Part 9 — Modules
  10. Terraform Part 10 — Loops and Conditionals
  11. Terraform Part 11 — Workspaces and Environment Separation
  12. Terraform Part 12 — Kubernetes and Helm Providers
  13. Terraform Part 13 — CI/CD Integration
  14. Terraform Part 14 — Testing and Policy
  15. Terraform Part 15 — Practical Patterns and Pitfalls

Terraform for Kubernetes Too?

Terraform is a tool for building cloud infrastructure. Using it to create an EKS cluster should be familiar. But you can also use Terraform to create Deployments inside that cluster or deploy Helm charts.

“Why use Terraform for that? There’s kubectl and ArgoCD.” Fair question. The short answer: Terraform is not always the right choice for managing Kubernetes, but there are situations where it clearly fits. Drawing that boundary precisely is the goal of this part.

flowchart TB
    subgraph "Two Approaches"
        direction LR
        TF["Terraform"]
        ArgoCD["ArgoCD / Flux\n(GitOps)"]
    end

    TF --> TFCase["As part of infrastructure:\ncluster bootstrap,\nstatic resources"]
    ArgoCD --> GitOpsCase["App deployment workflows,\nfrequently changing manifests"]

kubernetes Provider Basics

First, declare the provider and set up authentication.

terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.30"
    }
  }
}

provider "kubernetes" {
  # Option 1: Use kubeconfig file
  config_path    = "~/.kube/config"
  config_context = "my-cluster"
}

This accesses the cluster through the local kubeconfig. It’s fine when humans run apply manually, but CI requires a different approach.

The common pattern for creating an EKS cluster with Terraform and immediately deploying resources to it:

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

It retrieves the EKS API endpoint and CA certificate and injects them into the provider. aws_eks_cluster_auth is a data source that issues a fresh token on every run, so you don’t have to manage token rotation yourself (though very long runs can still hit expiration; see Common Problems below).

Creating Kubernetes Resources

Now let’s create resources in the cluster.

resource "kubernetes_namespace" "monitoring" {
  metadata {
    name = "monitoring"

    labels = {
      "pod-security.kubernetes.io/enforce" = "baseline"
    }
  }
}

resource "kubernetes_config_map" "app_config" {
  metadata {
    name      = "app-config"
    namespace = kubernetes_namespace.monitoring.metadata[0].name
  }

  data = {
    "config.yaml" = yamlencode({
      server = {
        port = 8080
        host = "0.0.0.0"
      }
      features = {
        caching = true
      }
    })
  }
}

resource "kubernetes_deployment" "nginx" {
  metadata {
    name      = "nginx"
    namespace = kubernetes_namespace.monitoring.metadata[0].name
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        app = "nginx"
      }
    }

    template {
      metadata {
        labels = {
          app = "nginx"
        }
      }

      spec {
        container {
          name  = "nginx"
          image = "nginx:1.25"

          port {
            container_port = 80
          }

          resources {
            limits = {
              cpu    = "500m"
              memory = "256Mi"
            }
            requests = {
              cpu    = "100m"
              memory = "128Mi"
            }
          }
        }
      }
    }
  }
}

Aside from the syntax being HCL, it corresponds almost 1:1 to Kubernetes YAML. It’s long due to many nested blocks, but the structure is clear.

kubernetes_manifest — Handling Arbitrary CRDs

Resource types like kubernetes_deployment and kubernetes_service are predefined by the provider. Custom Resources like Istio’s VirtualService or Argo Rollouts’ Rollout may not have a dedicated provider resource.

In such cases, kubernetes_manifest lets you pass any manifest as-is.

resource "kubernetes_manifest" "virtual_service" {
  manifest = yamldecode(file("${path.module}/manifests/virtualservice.yaml"))
}

Or inline:

resource "kubernetes_manifest" "prometheus_rule" {
  manifest = {
    apiVersion = "monitoring.coreos.com/v1"
    kind       = "PrometheusRule"

    metadata = {
      name      = "high-error-rate"
      namespace = "monitoring"
    }

    spec = {
      groups = [{
        name = "api.rules"
        rules = [{
          alert = "HighErrorRate"
          expr  = "rate(http_requests_total{status=~\"5..\"}[5m]) > 0.05"
          for   = "5m"
          labels = {
            severity = "warning"
          }
          annotations = {
            summary = "High error rate detected"
          }
        }]
      }]
    }
  }
}
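If you have a whole directory of such manifest files, for_each plus fileset applies them all. A minimal sketch; it assumes each file contains exactly one YAML document, since yamldecode parses only one:

resource "kubernetes_manifest" "extras" {
  # One instance per file in manifests/ (single-document files only).
  for_each = fileset("${path.module}/manifests", "*.yaml")

  manifest = yamldecode(file("${path.module}/manifests/${each.value}"))
}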

Note: kubernetes_manifest requires the CRD to already be installed in the cluster at the first plan. If you handle the Helm chart that installs the CRD and this resource in the same Terraform run, ordering issues can arise. It’s safer to install CRDs first via a separate Terraform project or Helm.
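One way to stage it, sketched below. This assumes the prometheus-community repo’s dedicated CRD chart; adjust for whatever operator you use:

# Project A: install only the CRDs via a dedicated chart.
resource "helm_release" "prometheus_crds" {
  name             = "prometheus-operator-crds"
  namespace        = "monitoring"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "prometheus-operator-crds"
  create_namespace = true
}

# Project B (separate state, applied afterwards) then holds the
# kubernetes_manifest resources for the CRs themselves.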

helm Provider — Deploying Helm Charts

The helm provider lets you deploy Helm charts from Terraform.

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.13"
    }
  }
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}

Similar authentication settings go inside the kubernetes block.

Now deploy a chart.

resource "helm_release" "ingress_nginx" {
  name       = "ingress-nginx"
  namespace  = "ingress-nginx"
  repository = "https://kubernetes.github.io/ingress-nginx"
  chart      = "ingress-nginx"
  version    = "4.11.0"

  create_namespace = true

  values = [
    yamlencode({
      controller = {
        replicaCount = 2
        service = {
          type = "LoadBalancer"
          annotations = {
            "service.beta.kubernetes.io/aws-load-balancer-type" = "nlb"
          }
        }
        resources = {
          requests = {
            cpu    = "100m"
            memory = "128Mi"
          }
        }
      }
    })
  ]
}

The values argument takes a list of YAML strings, and yamlencode lets you write that YAML as HCL inline. If you only want to override individual values, you can use set blocks instead.

resource "helm_release" "grafana" {
  name       = "grafana"
  namespace  = "monitoring"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "grafana"
  version    = "7.3.0"

  set {
    name  = "adminPassword"
    value = var.grafana_admin_password
  }

  set {
    name  = "persistence.enabled"
    value = "true"
  }

  set {
    name  = "persistence.size"
    value = "10Gi"
  }
}

For overriding a few simple values, set is convenient. For extensive configuration, values is cleaner.
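When one of those values is a secret, set_sensitive (same shape as set) at least keeps it out of plan and apply output. A variant sketch of the Grafana release above; note the value is still written to state:

resource "helm_release" "grafana" {
  name       = "grafana"
  namespace  = "monitoring"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "grafana"
  version    = "7.3.0"

  # Like set, but the value is redacted from CLI output.
  # It still ends up in the state file, however.
  set_sensitive {
    name  = "adminPassword"
    value = var.grafana_admin_password
  }
}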

Cluster Bootstrap — The Ideal Use Case

There’s a case where using Terraform for Kubernetes resources is clearly the right call: cluster bootstrap.

Say you’ve just created a new EKS cluster. For this cluster to function properly, several base components need to be installed first: the AWS Load Balancer Controller, external-dns, cert-manager, and (if you’re going GitOps) ArgoCD itself.

These components must move in lockstep with cluster creation. They’re needed immediately after EKS is provisioned. Even if you want to use ArgoCD, the question becomes: who installs ArgoCD itself?

flowchart LR
    TF["Terraform"] -->|"1. Create infra"| EKS["EKS Cluster"]
    TF -->|"2. Bootstrap"| Boot["aws-lb-controller\nexternal-dns\ncert-manager\nArgoCD"]
    Boot -->|"3. Delegate subsequent\ndeployments"| ArgoCD["ArgoCD manages\napp deployments"]

This stage is where Terraform is natural. These are things created once at the infrastructure layer, rarely change, and when they do change, it’s mostly version upgrades.

In code, it looks like this.

# 1. Create EKS cluster (using module)
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "prod-eks"
  cluster_version = "1.29"
  # ... remaining configuration
}

# 2. OIDC provider for IRSA (auto-created by module)

# 3. Install aws-load-balancer-controller
resource "helm_release" "aws_load_balancer_controller" {
  name       = "aws-load-balancer-controller"
  namespace  = "kube-system"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"

  set {
    name  = "clusterName"
    value = module.eks.cluster_name
  }

  # module.aws_lb_controller_irsa is assumed to be an IRSA module
  # (defined elsewhere) that creates the controller's IAM role.
  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = module.aws_lb_controller_irsa.iam_role_arn
  }

  depends_on = [module.eks]
}

# 4. Install ArgoCD
resource "helm_release" "argocd" {
  name             = "argocd"
  namespace        = "argocd"
  repository       = "https://argoproj.github.io/argo-helm"
  chart            = "argo-cd"
  version          = "6.7.0"
  create_namespace = true

  values = [file("${path.module}/argocd-values.yaml")]
}

# 5. ArgoCD bootstrap Application (App of Apps)
resource "kubernetes_manifest" "root_app" {
  manifest = yamldecode(file("${path.module}/root-app.yaml"))
  depends_on = [helm_release.argocd]
}

This is where Terraform’s job ends. After this, root-app.yaml points to the apps/ directory in Git, and ArgoCD takes over actual application deployments. Terraform lays the foundation and steps back.
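For reference, that same root Application can also be written inline instead of loaded via yamldecode. A sketch of its typical shape; the repo URL and path are placeholders:

resource "kubernetes_manifest" "root_app" {
  manifest = {
    apiVersion = "argoproj.io/v1alpha1"
    kind       = "Application"

    metadata = {
      name      = "root"
      namespace = "argocd"
    }

    spec = {
      project = "default"

      source = {
        repoURL        = "https://github.com/example/apps-repo" # placeholder
        targetRevision = "HEAD"
        path           = "apps"
      }

      destination = {
        server = "https://kubernetes.default.svc"
      }

      syncPolicy = {
        automated = {
          prune    = true
          selfHeal = true
        }
      }
    }
  }

  depends_on = [helm_release.argocd]
}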

Terraform vs ArgoCD — What to Assign Where

When deciding how to manage your cluster, the key question is: “How frequently does this resource change?”

flowchart TB
    Start["K8s resource to manage"] --> Freq{"Change frequency"}

    Freq -->|"Daily, weekly\n(app deploys, scaling)"| ArgoCD["ArgoCD / Flux\n(GitOps)"]
    Freq -->|"Quarterly, annually\n(cluster components)"| TF["Terraform"]
    Freq -->|"One-time setup\n(initial bootstrap)"| TF

    ArgoCD -.-> Reason1["Fast feedback,\nrollback,\ndeveloper-friendly"]
    TF -.-> Reason2["In sync with infra,\nstate-based management,\ndrift detection"]

Where Terraform fits: cluster bootstrap components (load balancer controller, external-dns, cert-manager, ArgoCD itself), resources wired to infrastructure outputs such as IRSA role ARNs and cluster names, and static resources that change a few times a year at most.

Where ArgoCD/Flux fits: application Deployments, Services, and their manifests; anything that changes daily or weekly; workflows that developers drive themselves.

Gray areas — use judgment: things like namespaces, the monitoring stack, or the ingress controller can reasonably live on either side. What matters is picking exactly one owner per resource.

Mixing these boundaries makes ownership unclear. If both Terraform and ArgoCD manage the same Deployment, each keeps reverting the other’s changes. The principle of one resource, one tool is crucial.

Limitations of Managing Kubernetes with Terraform

These are things you learn only after bumping into them in practice.

1) State size explosion

Managing hundreds of K8s resources with Terraform makes the state file enormous. terraform plan slows down and CI times increase.

2) Ad-hoc edits create drift

During incident response, emergency fixes with kubectl scale or kubectl edit cause the Terraform state and actual state to diverge. The next apply unintentionally reverts things.
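When some out-of-band changes are expected, say an HPA owns the replica count, you can carve that field out with lifecycle ignore_changes. A sketch against the Deployment from earlier:

resource "kubernetes_deployment" "nginx" {
  # ... metadata and spec as in the earlier example ...

  lifecycle {
    # Let an HPA (or an emergency kubectl scale) change the
    # replica count without Terraform reverting it.
    ignore_changes = [spec[0].replicas]
  }
}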

3) Rollbacks are cumbersome

ArgoCD rolls back instantly via Git revert. To roll back with Terraform, you need to revert the code and run apply. This is burdensome in emergencies.

4) Developer accessibility

If developers handle app deployments directly, Terraform has a learning curve. ArgoCD’s UI or manifest PRs are more familiar.

Practical Configuration Example

Here’s a setup I frequently use.

infra-repo/
├── envs/prod/
│   ├── network/          # VPC, subnets (Terraform)
│   ├── cluster/          # EKS cluster (Terraform)
│   └── bootstrap/        # aws-lb-controller, ArgoCD, etc. (Terraform + Helm)

apps-repo/
├── apps/                 # ArgoCD Application manifests
└── charts/               # Per-app Helm charts or Kustomize

Terraform handles cluster-level management, ArgoCD handles application-level deployment. Roles are clearly divided.

Common Problems and Solutions

Problem: Cluster auth token expiration

In long-running CI, the token issued by aws_eks_cluster_auth may expire mid-run.

Solution: Switch to the exec plugin approach.

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
  }
}

exec fetches a new token via AWS CLI on demand. Safe even for long-running applies.

Problem: Plan fails because CRD doesn’t exist

kubernetes_manifest requires the CRD at plan time.

Solution: Separate CRD installation and CR creation into different Terraform projects, or install CRDs together via Helm chart.

Problem: Sensitive information in Helm values

Anything placed in helm_release values or set blocks is stored in the state file, so plaintext passwords end up in state (set_sensitive only hides them from CLI output).

Solution: Separate into Kubernetes Secrets or use External Secrets Operator. Only put Secret references in values.
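As an illustration, the Grafana chart can read its admin credentials from an existing Secret rather than a chart value. A sketch that assumes your chart version supports admin.existingSecret and that a grafana-admin Secret is created out of band (e.g., by External Secrets Operator):

resource "helm_release" "grafana" {
  name       = "grafana"
  namespace  = "monitoring"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "grafana"

  # Only the Secret's name, not the password, lands in state.
  set {
    name  = "admin.existingSecret"
    value = "grafana-admin"
  }
}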


Managing Kubernetes with Terraform isn’t a silver bullet. Use it only for things that change infrequently and are tightly coupled with the infrastructure layer, and delegate application deployments to GitOps tools. This combination works well for most teams. Drawing clear boundaries is the key to long-term operations.

In the next part, we’ll cover CI/CD integration. We’ll look at how to automate Terraform with GitHub Actions and Atlantis, along with approval workflows and secret management.

Part 13: CI/CD Integration

