10 Terraform Best Practices for Optimizing Infrastructure-as-Code

Hi friend! Infrastructure-as-code (IaC) has revolutionized how teams provision and manage infrastructure in the cloud. And HashiCorp's Terraform has become one of the most widely used IaC tools for defining, deploying, and versioning cloud, hybrid, and on-premises architectures.

As adoption continues accelerating across industries, it's essential for practitioners to implement standards and best practices early on for managing infrastructure at scale.

In this comprehensive guide, I'll cover 10 Terraform best practices that every engineer should follow to optimize workflows, collaboration, productivity, and innovation as codebases grow.

Overview of 10 Best Practices

Here's a quick overview of the Terraform guidelines we'll cover:

  1. Directory structures for modular code
  2. Resource naming conventions
  3. Variable naming standards
  4. Leveraging community modules
  5. Upgrading Terraform regularly
  6. Backing up and locking state
  7. Using self variables
  8. Incrementally applying changes
  9. Managing secrets with var-files
  10. Local Docker development

Now let's explore each of these Terraform best practices in detail!

1. Logical Directory Structures

As infrastructure needs scale up from just a few files to hundreds of resources, modules, variables, and outputs, it becomes essential to architect Terraform codebases for understandability.

For small projects, you may be able to use a single main.tf file. But as configs grow in complexity, I recommend structured directories:

terraform_project
├── modules
│   ├── compute
│   ├── database
│   └── networking
├── dev
│   ├── main.tf
│   └── variables.tf
├── prod 
│   ├── main.tf
│   └── variables.tf
└── modules.tf

This project structure separates infrastructure into reusable components under modules/, isolates environment configs under dev/ and prod/, and imports modules in the root modules.tf file.
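
For example, each environment's main.tf can instantiate the shared modules with environment-specific inputs. Here is a minimal sketch, assuming the networking module exposes a cidr_block input:

# dev/main.tf: wire the shared networking module into the dev environment
module "networking" {
  source     = "../modules/networking"
  cidr_block = "10.0.0.0/16"
}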

The benefits of modular Terraform code include:

  • Easier to manage discrete components
  • Changes limited to each module
  • Components can be open sourced and reused
  • Separates environments for customization

As seen in large open source Terraform frameworks like Terraform AWS Modules, proper code structure is foundational.

2. Resource Naming Conventions

Terraform's official provider resource types follow the format <PROVIDER>_<TYPE>, and each resource block adds a local name, giving addresses like aws_instance.app_server.

For example:

resource "aws_instance" "app_server" {
  #...
}

Some benefits of standardized naming:

  • Identify resources quickly in large codebases
  • Reduce confusion between variable names and resources
  • Maintain consistency across modules and config files

HashiCorp also recommends namespacing custom module resources with a project prefix like app_.
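
For instance, a module belonging to an "app" project might prefix its resource names accordingly. A minimal sketch of that convention:

# Resource names carry the app_ project prefix for quick identification
resource "aws_security_group" "app_web" {
  name        = "app-web-sg"
  description = "Web tier security group for the app project"
}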

Adhering to naming conventions improves collaboration and understanding as codebases scale.

3. Variable Naming Consistency

Similarly, variables should follow a logical naming scheme so they can be identified throughout modules. A common convention is <component>_<resource>_<attribute> (a var_ prefix is redundant, since references already read var.<name>):

variable "<component>_<resource>_<attribute>" {
  default = ""
}

For example:

variable "vpc_cidr_block" {
  default = "10.0.0.0/16"
}

Benefits include:

  • Eliminate confusion by using conventions
  • Identify usage across modules
  • Changes limited to specific variables

Consistency is key as codebases grow!
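
At the point of use, a well-named variable makes its role obvious. A minimal sketch referencing the vpc_cidr_block variable above:

# The variable name documents its purpose wherever it is referenced
resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr_block
}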

4. Leverage Community Modules

Before reinventing the wheel for network topologies, Kubernetes clusters, storage infrastructure, and more, check the public Terraform Module Registry.

With over 3,000 modules and more added daily, the registry saves countless hours by providing battle-tested infrastructure code that follows best practices.

Additional benefits include:

  • Modules created by subject matter experts
  • Supports all major cloud providers
  • Reduces developmental overhead
  • Frequent updates with bug fixes and features

You can browse modules by cloud provider, popularity, verified status, and other filters to find relevant solutions. And contributing back to modules improves ecosystem quality over time.
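
For example, the popular terraform-aws-modules/vpc module can stand up a VPC in a few lines. A minimal sketch (the version constraint and inputs here are illustrative):

# Source a community VPC module from the public registry
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "example-vpc"
  cidr = "10.0.0.0/16"
}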

5. Upgrade Terraform Regularly

Terraform evolves quickly, with frequent releases delivering new functionality and improvements.

[Figure: Terraform major releases by year (Src: Consul Upgrade Center)]

Upgrading to the latest version ensures you leverage the newest features and bug fixes integrated by the Terraform community.

Delaying upgrades risks accruing technical debt, facing upgrade difficulties down the road, and missing out on innovations like policy-as-code, queryable state, enhanced modules, and more. I recommend upgrading within 1-2 weeks of new major releases.
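
To make upgrades deliberate rather than accidental, pin the Terraform and provider versions a configuration expects and raise the constraints as part of each upgrade. A minimal sketch (the version numbers are illustrative):

terraform {
  # Raise these constraints deliberately when adopting a new release
  required_version = ">= 1.3.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}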

6. Back Up and Lock State

Terraform state files track metadata and resource details for your infrastructure. Losing state leaves Terraform unable to map your configuration to existing resources, forcing you to re-import or recreate them manually.

I strongly recommend enabling remote backends to persist state in durable storage like S3:

backend "s3" {
  bucket   = "terraform-state-bucket"  
  key      = "network/terraform.tfstate"
  region   = "us-east-1"

  dynamodb_table = "terraform-locks"
} 

For team workflows, use state locking so that only one operation executes concurrently, preventing corruption.

With the S3 backend, a DynamoDB table is commonly used to lock the state file during writes. Combined with durable, versioned remote storage, this protects state against corruption and supports collaboration and disaster recovery.
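
If you manage the lock table with Terraform as well, the S3 backend expects a DynamoDB table whose primary key is a string attribute named LockID. A minimal sketch:

# DynamoDB table used by the S3 backend for state locking
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}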

7. Leverage self Variables

Some resource attributes, such as assigned IP addresses, are only known after apply. Inside a resource's provisioner and connection blocks, the self object exposes that resource's own attributes, which you cannot reference by their normal address without creating a dependency cycle:

resource "aws_instance" "app" {
  ami           = "ami-047a51fa27710816e" 
  instance_type = "t2.micro"

  tags = {
    Name = "AppServerInstance"
  }
}

resource "null_resource" "configure" {
  provisioner "local-exec" {
    command = "echo ${self.app.private_ip} >> private_ips.txt"
  }
} 

Here the provisioner writes the instance's private IP using self, avoiding a self-reference to aws_instance.app. Provider data sources can also query existing resource details explicitly, as sketched below.
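
As one example, the AWS provider's aws_instance data source can read an existing instance's attributes. A minimal sketch (the instance ID is a hypothetical placeholder):

# Look up an existing EC2 instance by ID (placeholder value)
data "aws_instance" "existing_app" {
  instance_id = "i-0123456789abcdef0"
}

output "existing_app_private_ip" {
  value = data.aws_instance.existing_app.private_ip
}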

8. Incrementally Apply Changes

When applying changes, Terraform computes the actions needed to move your infrastructure from its current state to the desired state, and the execution plan outlines those actions before anything is modified.

In project workflows, I recommend an incremental approach via terraform apply, changing only one small module at a time. This minimizes the "blast radius" if any failures occur, while maintaining maximum service availability as changes roll out.

The incremental apply sequence should be:

  1. Local testing
  2. Stage/QA environments
  3. Production

Small changes reduce risk, and non-production verification gives higher confidence for production rollouts.
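
To keep a single apply scoped to one module, you can target it explicitly and review the saved plan before applying. A minimal sketch, assuming a module named networking as in the directory layout above:

# Plan and apply only the networking module
terraform plan -target=module.networking -out=networking.tfplan
terraform apply networking.tfplan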

9. Manage Secrets with var-files

To prevent credentials like access keys or passwords from leaking into source control, use var-files for managing secret variables securely:

Create a var-file (for example, secret.tfvars):

access_key = "AKIAIOSFODNN7EXAMPLE" 
secret_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

Pass var-file during run:

terraform apply -var-file=secret.tfvars

And restrict permissions on the file itself:

chmod 400 secret.tfvars

This best practice avoids exposing credentials while keeping them accessible for Terraform. Var-files provide security and customizability by environment.
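
Each value in the var-file must map to a declared variable. A minimal sketch that declares the assumed access_key and secret_key variables and marks them sensitive so Terraform redacts them from plan output:

variable "access_key" {
  type      = string
  sensitive = true
}

variable "secret_key" {
  type      = string
  sensitive = true
}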

10. Build Locally with Docker

Lastly, I highly recommend standardizing your toolchain by using Docker containers for local development and testing:

docker run -it --rm -v "$(pwd):/work" -w /work hashicorp/terraform:1.3.7 apply

Docker allows packaging Terraform, providers, dependencies, and modules into portable images that simplify executions, eliminate tool version conflicts, and streamline CI/CD pipelines.

HashiCorp offers official images that you can extend to match production infrastructure specifications. Containers provide consistency across environments.

Implementing these 10 Terraform best practices will empower teams to deliver infrastructure rapidly and reliably as codebases scale.

Following standards around state management, upgrades, structure, testing, integration, and toolchain workflows dramatically improves productivity over time.

Now you're equipped with prescriptive guidance to optimize any Terraform project! Let me know which practice you find most valuable by tweeting me @jeffdgrant.