
debug:terraform
Debug Terraform infrastructure-as-code issues systematically. This skill helps diagnose and resolve Terraform-specific problems including state lock conflicts, provider authentication failures, resource dependency cycles, state drift detection, import failures, module version conflicts, and plan/apply errors. Provides TF_LOG debugging, terraform console usage, state manipulation commands, and CI/CD best practices for infrastructure automation.
Terraform Debugging Guide
Debug Terraform configurations and state issues using a systematic four-phase approach with provider-specific considerations.
Error Classification
Terraform errors fall into four categories:
- Language Errors - Syntax and configuration issues
- State Errors - State file corruption, drift, or lock issues
- Core Errors - Terraform engine problems
- Provider Errors - Cloud provider authentication or API issues
Phase 1: Reproduce and Isolate
Gather Initial Information
# Check Terraform and provider versions
terraform version
# Validate configuration syntax
terraform validate
# Review current state
terraform state list
terraform state show <resource_address>
# Check for state drift
terraform plan -refresh-only
Enable Debug Logging
# Set log level (TRACE, DEBUG, INFO, WARN, ERROR)
export TF_LOG=DEBUG
# Write logs to file (recommended for large outputs)
export TF_LOG_PATH="./terraform-debug.log"
# For provider-specific debugging
export TF_LOG_PROVIDER=DEBUG
# Run command with logging
terraform plan
Log Levels:
- `TRACE` - Maximum verbosity, every action logged
- `DEBUG` - Detailed debugging for complex issues
- `INFO` - General informative messages
- `WARN` - Non-critical warnings
- `ERROR` - Critical errors only
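At TRACE or DEBUG the log file grows quickly, so it can help to pull out the higher-severity lines before reading the rest. A minimal sketch, assuming the bracketed level tags (`[ERROR]`, `[WARN]`) that Terraform writes into each log line; the function name `tf_log_filter` is hypothetical:

```shell
# Print only ERROR and WARN lines from a Terraform log file.
# tf_log_filter is an illustrative helper, not a Terraform command.
tf_log_filter() {
  # $1: path to the log file (defaults to the TF_LOG_PATH value used above)
  grep -E '\[(ERROR|WARN)\]' "${1:-./terraform-debug.log}"
}
```

For example, `tf_log_filter ./terraform-debug.log | less` gives a short list of candidate failures to search for in the full log.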
Isolate the Problem
# Target specific resource
terraform plan -target=aws_instance.example
# Check specific module
terraform plan -target=module.networking
# Validate the whole working directory and emit machine-readable JSON
# (terraform validate cannot target a single file)
terraform validate -json
Phase 2: Analyze Root Cause
Common Error Patterns and Solutions
State Lock Errors
Error: Error acquiring the state lock
Diagnosis:
# Check if another process is running
ps aux | grep terraform
# View lock info (for S3 backend)
aws dynamodb get-item --table-name terraform-locks --key '{"LockID":{"S":"your-state-path"}}'
Solution:
# Force unlock (use with caution!)
terraform force-unlock <LOCK_ID>
# For S3/DynamoDB backend
aws dynamodb delete-item --table-name terraform-locks --key '{"LockID":{"S":"your-state-path"}}'
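`terraform force-unlock` needs the lock ID, which Terraform prints in the "Lock Info" block of the error. A small sketch that extracts it from captured output; `extract_lock_id` is a hypothetical helper, and the parsing assumes the `ID:` line layout Terraform normally prints:

```shell
# Read Terraform error output on stdin and print the first lock ID found.
# extract_lock_id is an illustrative helper name.
extract_lock_id() {
  awk '/^[[:space:]]*ID:/ { print $2; exit }'
}
```

A possible use is `terraform plan -no-color 2>&1 | extract_lock_id`. Confirm that no other run is still active before passing the result to `terraform force-unlock`.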
Provider Authentication Failures
Error: error configuring Terraform AWS Provider: no valid credential sources found
Diagnosis:
# Check AWS credentials
aws sts get-caller-identity
# Verify environment variables
env | grep AWS
# Check credential file
cat ~/.aws/credentials
Solution:
# Explicit provider configuration
provider "aws" {
  region  = "us-east-1"
  profile = "my-profile"
  # Or use assume_role
  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/TerraformRole"
  }
}
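The diagnosis commands above (`env | grep AWS`, `cat ~/.aws/credentials`) print secret values to the terminal. As a gentler first check, the sketch below only reports which common credential sources are present. The helper name `aws_cred_sources` is hypothetical, and the check covers presence only, not validity; `aws sts get-caller-identity` remains the real test.

```shell
# Report which common AWS credential sources are present, without printing secret values.
# aws_cred_sources is an illustrative helper; it checks presence only, not validity.
aws_cred_sources() {
  [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ] && echo "env: static access keys"
  [ -n "${AWS_PROFILE:-}" ] && echo "env: AWS_PROFILE=${AWS_PROFILE}"
  [ -f "${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}" ] && echo "file: shared credentials file"
  return 0
}
```

If the function prints nothing, the "no valid credential sources found" error is expected, and the fix is to supply one of these sources or an explicit provider configuration.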
Resource Dependency Cycles
Error: Cycle: aws_security_group.a, aws_security_group.b
Diagnosis:
# Generate dependency graph
terraform graph | dot -Tpng > graph.png
# Or view in text format
terraform graph
Solution:
# Break the cycle by moving cross-referencing rules into standalone rule resources
resource "aws_security_group" "a" {
  name = "sg-a"
  # No inline ingress/egress rule that references aws_security_group.b
}
resource "aws_security_group_rule" "a_to_b" {
  security_group_id        = aws_security_group.a.id
  source_security_group_id = aws_security_group.b.id
  # ...
}
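On a large configuration the full graph is hard to read, so it can be easier to look only at the edges that touch the resources named in the cycle error. A minimal sketch, assuming `terraform graph` can still render the configuration and emits DOT edges of the form `"a" -> "b"`; `graph_edges_for` is a hypothetical helper name:

```shell
# Read `terraform graph` DOT output on stdin and print only the edges that mention a given address.
# graph_edges_for is an illustrative helper name.
graph_edges_for() {
  # $1: resource address fragment, e.g. aws_security_group.a
  grep -e '->' | grep -F -- "$1"
}
```

For example, `terraform graph | graph_edges_for aws_security_group.a` lists what that security group depends on and what depends on it, which usually makes the offending reference visible.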
State Drift
Note: Objects have changed outside of Terraform
Diagnosis:
# Detect drift
terraform plan -refresh-only
# Show current state
terraform show
# Pull remote state for inspection
terraform state pull > state.json
Solution:
# Accept external changes into state
terraform apply -refresh-only
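# Or re-import a drifted resource (remove it from state first; import fails if the address is already managed)
terraform state rm aws_instance.example
terraform import aws_instance.example i-1234567890abcdef0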
# Or replace resource to match config
terraform apply -replace=aws_instance.example
Import Failures
Error: Cannot import non-existent remote object
Diagnosis:
# Verify resource exists
aws ec2 describe-instances --instance-ids i-1234567890abcdef0
# Inspect the resource schema (the expected import ID format is listed in the provider documentation)
terraform providers schema -json | jq '.provider_schemas["registry.terraform.io/hashicorp/aws"].resource_schemas["aws_instance"]'
Solution:
# Correct import command
terraform import aws_instance.example i-1234567890abcdef0
# For modules
terraform import module.ec2.aws_instance.example i-1234567890abcdef0
# Generate configuration for resources declared in import blocks (Terraform 1.5+)
terraform plan -generate-config-out=generated.tf
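The `-generate-config-out` flag only acts on resources that are declared in `import` blocks, so those blocks have to exist first. A small sketch that prints one; `make_import_block` and the `imports.tf` file name used below are hypothetical:

```shell
# Print a Terraform import block for the given address and ID.
# make_import_block is an illustrative helper; its output is HCL meant for a .tf file.
make_import_block() {
  # $1: resource address, $2: provider-specific resource ID
  printf 'import {\n  to = %s\n  id = "%s"\n}\n' "$1" "$2"
}
```

One way to use it: `make_import_block aws_instance.example i-1234567890abcdef0 >> imports.tf`, followed by the `terraform plan -generate-config-out=generated.tf` command above. Review the generated configuration before applying.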
Module Version Conflicts
Error: Module version requirements have changed
Diagnosis:
# Check installed module versions (recorded by terraform init)
cat .terraform/modules/modules.json
# View locked provider versions (the lock file tracks providers, not modules)
cat .terraform.lock.hcl
Solution:
# Upgrade modules
terraform init -upgrade
# Clear cache and reinitialize
rm -rf .terraform
terraform init
# Pin specific version
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"
}
Resource Already Exists
Error: Error creating S3 bucket: BucketAlreadyExists
Solution:
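# Import the existing resource (only works if your account owns the bucket;
# S3 bucket names are global, so this error can also mean another account holds the name,
# in which case a different bucket name is needed)
terraform import aws_s3_bucket.example my-bucket-name
# Or use a data source to reference a bucket managed elsewhere
data "aws_s3_bucket" "existing" {
  bucket = "my-bucket-name"
}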
Phase 3: Fix and Verify
Interactive Debugging
# Open Terraform console for expression testing
terraform console
# Test expressions
> var.instance_type
> aws_instance.example.id
> length(var.subnets)
> jsonencode(local.tags)
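The console also reads expressions from standard input, which is convenient for checking a single value from a script or a one-off command. A minimal sketch; `tf_eval` is a hypothetical helper name:

```shell
# Evaluate one expression with terraform console without opening the interactive prompt.
# tf_eval is an illustrative helper name.
tf_eval() {
  # $1: expression to evaluate, e.g. 'length(var.subnets)'
  printf '%s\n' "$1" | terraform console
}
```

For example, `tf_eval 'jsonencode(local.tags)'` prints the evaluated value and exits. Like the interactive console, it needs an initialized working directory and readable state.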
State Manipulation (Use with Caution)
# Remove resource from state (doesn't destroy)
terraform state rm aws_instance.orphaned
# Move resource in state
terraform state mv aws_instance.old aws_instance.new
# Move to different state file
terraform state mv -state-out=other.tfstate aws_instance.example aws_instance.example
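# Replace provider in state (source and target must differ;
# registry.acme.corp/acme/aws is a placeholder for your own provider address)
terraform state replace-provider hashicorp/aws registry.acme.corp/acme/aws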
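Because these commands rewrite state directly, taking a copy first gives a way back. A minimal sketch built on `terraform state pull`; `backup_state` and the default file name pattern are hypothetical:

```shell
# Save a copy of the current state before running state rm / state mv.
# backup_state is an illustrative helper name.
backup_state() {
  # $1: optional destination file; defaults to a timestamped name in the current directory
  backup_file="${1:-state-backup-$(date +%Y%m%d-%H%M%S).tfstate}"
  if terraform state pull > "$backup_file"; then
    printf '%s\n' "$backup_file"
  else
    # Do not leave an empty or partial backup behind
    rm -f "$backup_file"
    return 1
  fi
}
```

The backup contains the same sensitive values as the real state, so keep it out of version control. If a restore is ever needed, `terraform state push` can upload it, but that overwrites the current state and deserves the same caution as the commands above.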
Safe Apply Strategies
# Preview changes
terraform plan -out=tfplan
# Apply with plan file (recommended)
terraform apply tfplan
# Apply specific resource only
terraform apply -target=aws_instance.example
# Destroy and recreate
terraform apply -replace=aws_instance.problematic
Phase 4: Document and Prevent
Pre-Commit Validation
# Validate syntax
terraform validate
# Format check
terraform fmt -check -recursive
# Use tflint for best practices
tflint --init
tflint
# Security scanning with checkov
checkov -d .
# Cost estimation
infracost breakdown --path=.
CI/CD Best Practices
# CI-friendly commands
terraform init -input=false
terraform plan -input=false -no-color -out=tfplan
terraform apply -input=false -no-color tfplan
# Lock provider versions
terraform providers lock -platform=linux_amd64 -platform=darwin_amd64
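A pipeline often needs to tell "nothing to do" apart from "changes pending". With `-detailed-exitcode`, `terraform plan` exits with 0 when there are no changes, 1 on error, and 2 when changes are present. A small sketch that turns the status into a label a later step can branch on; `describe_plan_exit` is a hypothetical helper name:

```shell
# Translate the exit status of `terraform plan -detailed-exitcode` into a short label.
# describe_plan_exit is an illustrative helper name.
describe_plan_exit() {
  case "$1" in
    0) echo "no-changes" ;;
    1) echo "error" ;;
    2) echo "changes-present" ;;
    *) echo "unknown" ;;
  esac
}

# Example use in a pipeline step (a script running under `set -e` must capture the status explicitly):
#   status=0
#   terraform plan -input=false -no-color -detailed-exitcode -out=tfplan || status=$?
#   describe_plan_exit "$status"
```

Treating status 2 as a failure would break every run that has real changes, so the apply step is usually gated on the label rather than on a plain success check.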
Configuration Best Practices
# Pin Terraform version
terraform {
  required_version = ">= 1.5.0, < 2.0.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
# Use variables with validation
variable "environment" {
  type = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}
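# Add lifecycle rules for critical resources
resource "aws_instance" "critical" {
  # ...
  lifecycle {
    # Rejects any plan that would destroy this resource, including a replacement
    prevent_destroy = true
    # Only takes effect for replacements, so it matters once prevent_destroy is lifted
    create_before_destroy = true
  }
}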
Quick Reference Commands
| Command | Purpose |
|---|---|
| `terraform validate` | Check syntax and configuration |
| `terraform plan` | Preview changes |
| `terraform plan -refresh-only` | Detect state drift |
| `terraform state list` | List all resources in state |
| `terraform state show <addr>` | Show resource details |
| `terraform state pull` | Download remote state |
| `terraform state rm <addr>` | Remove from state |
| `terraform import <addr> <id>` | Import existing resource |
| `terraform force-unlock <id>` | Release state lock |
| `terraform console` | Interactive expression testing |
| `terraform graph` | Generate dependency graph |
| `terraform providers` | Show required providers |
| `terraform output` | Show output values |
| `terraform taint <addr>` | Mark for recreation (deprecated) |
| `terraform apply -replace=<addr>` | Force resource replacement |
Environment Variables
| Variable | Purpose |
|---|---|
| `TF_LOG` | Set log level (TRACE/DEBUG/INFO/WARN/ERROR) |
| `TF_LOG_PATH` | Write logs to file |
| `TF_LOG_PROVIDER` | Provider-specific logging |
| `TF_INPUT` | Disable interactive prompts (0/false) |
| `TF_VAR_name` | Set variable value |
| `TF_CLI_ARGS` | Additional CLI arguments |
| `TF_DATA_DIR` | Custom data directory (default: .terraform) |
| `TF_WORKSPACE` | Set workspace |
| `TF_IN_AUTOMATION` | Adjust output for automation |
| `TF_PLUGIN_CACHE_DIR` | Share providers across projects |
Debugging Checklist
- [ ] Read the complete error message carefully
- [ ] Check Terraform and provider versions
- [ ] Run `terraform validate` for syntax issues
- [ ] Enable debug logging with `TF_LOG=DEBUG`
- [ ] Check state with `terraform state list`
- [ ] Verify provider credentials and permissions
- [ ] Check for resource dependencies with `terraform graph`
- [ ] Review `.terraform.lock.hcl` for version conflicts
- [ ] Test expressions with `terraform console`
- [ ] Check cloud provider console for external changes
- [ ] Review recent changes in version control
- [ ] Search provider documentation for resource-specific issues
Security Reminders
- Never commit state files or `.tfvars` with secrets to version control
- Sanitize debug logs before sharing (may contain credentials)
- Disable `TF_LOG` in production to prevent sensitive data exposure
- Use remote state with encryption for team environments
- Rotate credentials if exposed in logs
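Sanitizing a debug log before sharing it can be partly automated. The sketch below redacts AWS access key IDs (which start with `AKIA` or `ASIA`) and the values of fields whose names begin with `secret`, `password`, or `token`. The helper name `redact_log` is hypothetical, and the patterns are a starting point under those assumptions rather than a complete filter, so a manual read-through is still needed.

```shell
# Redact AWS access key IDs and obvious secret-looking values from a log on stdin.
# redact_log is an illustrative helper; the patterns are a starting point, not a complete filter.
redact_log() {
  sed -E \
    -e 's/(AKIA|ASIA)[0-9A-Z]{16}/[REDACTED_ACCESS_KEY_ID]/g' \
    -e 's/((secret|password|token)[A-Za-z_]*"?[[:space:]]*[=:][[:space:]]*"?)[^"[:space:],]+/\1[REDACTED]/g'
}
```

For example, `redact_log < terraform-debug.log > terraform-debug.redacted.log` writes a copy that is safer to attach to an issue. The field-name match is case-sensitive and deliberately broad, so it may hide some harmless values.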