Terraform is the infrastructure as code offering from HashiCorp. It is a tool for building, changing, and managing infrastructure in a safe, repeatable way. Terraform manages environments using a configuration language called HashiCorp Configuration Language (HCL), which enables human-readable, automated deployments.
Infrastructure as code is the practice of managing infrastructure in a file or files, kept in repositories just like the ones we maintain for our code, rather than manually configuring resources in a user interface. A resource might be any piece of infrastructure in a given environment, such as a virtual machine, security group, network interface, etc. At a high level, Terraform lets us use HCL to author files containing definitions of our desired infrastructure/environments on almost any provider (AWS, Google Cloud, GitHub, Docker, etc.) and automates the creation, update, and deletion of that infrastructure when the configuration is applied.
Refer to this repository for the Terraform code used in this article.
Workflow
A simple workflow for deployment closely follows the steps below (a command-line sketch follows the list):
- Scope – Confirm what resources need to be created for a given project.
- Author – Create the configuration file in HCL based on the scoped parameters.
- Initialize – Run terraform init in the project directory with the configuration files. This will download the correct provider plug-ins for the project.
- Plan – Run terraform plan to verify the creation process.
- Apply – Run terraform apply to create the real resources as well as the state file, which compares future changes in your configuration files to what actually exists in your deployment environment.
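For reference, the steps above map to a handful of CLI commands. A minimal sketch (the directory name is a placeholder):

cd my-terraform-project   # directory containing your .tf files (placeholder name)
terraform init            # download the provider plug-ins
terraform plan            # preview the changes Terraform would make
terraform apply           # create or update the real resources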
Let’s start building some infrastructure
I'm assuming you use Google Cloud Shell, as it comes pre-installed with Terraform. Terraform recognizes files ending in .tf or .tf.json as configuration files and will load them when it runs.
Create a file named main.tf and add the following content to it, either by opening the file in the Cloud Shell editor or by using the nano editor.
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
  }
}

provider "google" {
  version = "3.5.0"
  project = "<PROJECT_ID>"
  region  = "us-central1"
  zone    = "us-central1-c"
}

resource "google_compute_network" "vpc_network" {
  name = "terraform-network"
}
Replace <PROJECT_ID> with your GCP project ID. Let's understand the above code.
Terraform Block
The terraform {} block is required so Terraform knows which provider to download from the Terraform Registry. In the configuration above, the google provider's source is defined as hashicorp/google, which is shorthand for registry.terraform.io/hashicorp/google.
You can also assign a version to each provider defined in the required_providers block. The version argument is optional, but recommended. It is used to constrain the provider to a specific version or a range of versions in order to prevent downloading a new provider that may contain breaking changes. If the version isn't specified, Terraform will automatically download the most recent provider during initialization.
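For example, a version constraint can be added inside the required_providers block. A minimal sketch (the exact version range is your choice):

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 3.5"  # accept 3.5 and any newer 3.x release, but not 4.0
    }
  }
}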
To learn more, reference the provider source documentation.
Providers
The provider block is used to configure the named provider, in this case google. A provider is responsible for creating and managing resources. Multiple provider blocks can exist if a Terraform configuration manages resources from different providers.
Initialization
The first command to run for a new configuration, or after checking out an existing configuration from version control, is terraform init, which initializes various local settings and data that will be used by subsequent commands.
You can initialize the Terraform configuration by running the terraform init command in the same directory as your main.tf file:
terraform init
Let’s create the resource
terraform apply
The output has a + next to resource "google_compute_network" "vpc_network", meaning that Terraform will create this resource. Beneath that, it shows the attributes that will be set. When the value displayed is (known after apply), it means that the value won't be known until the resource is created.
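For illustration, the relevant part of the output looks roughly like this (abridged and approximate; the exact attributes depend on the provider version):

  # google_compute_network.vpc_network will be created
  + resource "google_compute_network" "vpc_network" {
      + name      = "terraform-network"
      + id        = (known after apply)
      + self_link = (known after apply)
      ...
    }

Plan: 1 to add, 0 to change, 0 to destroy.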
If the plan was created successfully, Terraform will now pause and wait for approval before proceeding. If anything in the plan seems incorrect or dangerous, it is safe to abort here with no changes made to your infrastructure.
Type yes at the confirmation prompt to proceed.
If you check the VPC networks page in the GCP console, you will see that the network is provisioned now.
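You can also verify from Cloud Shell with gcloud, assuming the gcloud CLI is configured for the same project:

gcloud compute networks list --filter="name=terraform-network"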
Run the terraform show command to inspect the current state. These values can be referenced to configure other resources or outputs.
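For example, an attribute shown by terraform show can also be exposed as an output. A minimal sketch (the output name is arbitrary):

output "network_self_link" {
  value = google_compute_network.vpc_network.self_link
}

After the next terraform apply, the value is printed at the end of the run and can also be read with terraform output.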
Update Infrastructure
After creating basic infrastructure (VPC network), let’s modify the configuration and update the infrastructure.
Infrastructure is continuously evolving, and Terraform was built to help manage and enact that change. Terraform compares your changes with the real infrastructure by preparing an execution plan and makes only the changes required. It won't destroy the entire infrastructure and re-create it, but will update the infrastructure with the necessary changes.
By using Terraform to change infrastructure, you can version control not only your configurations but also your state so you can see how the infrastructure evolves over time.
Adding a compute resource
Let's add a compute resource to main.tf. Add the following block of code to the file.
resource "google_compute_instance" "vm_instance" {
name = "terraform-instance"
machine_type = "f1-micro"
boot_disk {
initialize_params {
image = "debian-cloud/debian-9"
}
}
network_interface {
network = google_compute_network.vpc_network.name
access_config {
}
}
}
This resource includes a few more arguments. The name and machine type are simple strings, but boot_disk and network_interface are more complex blocks. You can see all of the available options in the documentation.
For this example, the compute instance will use a Debian operating system and will be connected to the VPC network we created earlier. Notice how this configuration refers to the network's name property with google_compute_network.vpc_network.name: here, google_compute_network.vpc_network is the ID of the resource, matching the values in the block that defines the network, and name is a property of that resource.
The presence of the access_config block, even without any arguments, ensures that the instance will be accessible over the internet.
Let's run terraform apply to create the compute instance:
terraform apply
Answer yes to the confirmation prompt, and check that the resource has been created in Google Cloud.
Updating a compute resource
Let's add a tags argument to your vm_instance so that it looks like this:
resource "google_compute_instance" "vm_instance" {
name = "terraform-instance"
machine_type = "f1-micro"
tags = ["web", "dev"]
...
}
Run terraform apply again to update the instance:
terraform apply
The ~ prefix means that Terraform will update the resource in-place. You can apply this change now by responding yes, and Terraform will add the tags to your instance.
Changes that recreate the infrastructure
Some configuration changes require Terraform to re-create the resource, because the cloud provider doesn't support updating the resource in the way described by your configuration.
For example, changing the disk image of your compute instance, here updating Debian from version 9 to 11:
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
Run terraform apply again to see how Terraform will apply this change to the existing resources:
terraform apply --auto-approve
The --auto-approve switch automatically approves the apply, so you won't get a confirmation prompt.
The -/+ prefix means that Terraform will destroy and recreate the resource, rather than updating it in-place. While some attributes can be updated in-place (shown with the ~ prefix), changing the boot disk image for an instance requires recreating it. Terraform and the Google Cloud provider handle these details for you, and the execution plan makes it clear what Terraform will do.
Additionally, the execution plan shows that the disk image change is what required your instance to be replaced. Using this information, you can adjust your changes to possibly avoid destroy/create updates if they are not acceptable in some situations.
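If a replacement would be unacceptable, one option (not part of this walkthrough, shown only as an illustration) is Terraform's lifecycle meta-argument, which makes the plan fail instead of destroying the resource:

resource "google_compute_instance" "vm_instance" {
  # ... existing arguments ...

  lifecycle {
    prevent_destroy = true  # any plan that would destroy this resource is rejected with an error
  }
}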
Terraform first destroyed the existing instance and then created a new one in its place. You can use terraform show again to see the new values associated with this instance.
Resource Dependencies
Real-world infrastructure contains many resources with dependencies between them. A Terraform configuration can contain multiple resources and resource types from multiple providers.
Add the following section to the main.tf file:
resource "google_compute_address" "vm_static_ip" {
name = "terraform-static-ip"
}
And update the network_interface configuration:
  network_interface {
    network = google_compute_network.vpc_network.self_link
    access_config {
      nat_ip = google_compute_address.vm_static_ip.address
    }
  }
The access_config block has several optional arguments, and in this case you'll set nat_ip to the static IP address. When Terraform reads this configuration, it will:
- Ensure that vm_static_ip is created before vm_instance
- Save the properties of vm_static_ip in the state
- Set nat_ip to the value of the vm_static_ip.address property
Run the following command:
terraform plan -out static_ip
Saving the plan this way ensures that you can apply exactly the same plan in the future. If you try to apply the file created by the plan, Terraform will first check to make sure the exact same set of changes will be made before applying the plan.
terraform apply "static_ip"
Implicit and Explicit Dependencies
Terraform can automatically infer when one resource depends on another. The reference to google_compute_address.vm_static_ip.address creates an implicit dependency on the google_compute_address resource named vm_static_ip.
Terraform uses this dependency information to determine the correct order in which to create and update different resources. In the example above, Terraform knows that vm_static_ip must be created before vm_instance is updated to use it.
Sometimes there are dependencies between resources that are not visible to Terraform. The depends_on argument can be added to any resource and accepts a list of resources to create explicit dependencies on.
For example, if an application configured on the virtual machine uses a Google Cloud Storage bucket, that dependency is not visible to Terraform. Since the dependency is defined in your application, you should tell Terraform about it by defining it explicitly.
# New resource for the storage bucket our application will use.
resource "google_storage_bucket" "example_bucket" {
  name     = "<UNIQUE-BUCKET-NAME>"
  location = "US"

  website {
    main_page_suffix = "index.html"
    not_found_page   = "404.html"
  }
}

# Create a new instance that uses the bucket.
resource "google_compute_instance" "another_instance" {
  # Tells Terraform that this VM instance must be created only after the
  # storage bucket has been created.
  depends_on = [google_storage_bucket.example_bucket]

  name         = "terraform-instance-2"
  machine_type = "f1-micro"

  boot_disk {
    initialize_params {
      image = "cos-cloud/cos-stable"
    }
  }

  network_interface {
    network = google_compute_network.vpc_network.self_link
    access_config {
    }
  }
}
Note: The order in which resources are defined in a Terraform configuration file has no effect on how Terraform applies your changes. Organize your configuration files in a way that makes the most sense for you and your team.
Now run terraform plan and terraform apply to see these changes in action:
terraform plan
terraform apply --auto-approve
Provisioners in Terraform
The virtual machine we created from the OS image has no additional software installed or configuration applied.
Note: Google Cloud allows customers to manage their own custom operating system images. This can be a great way to ensure the instances you provision with Terraform are pre-configured based on your needs. Packer is the perfect tool for this and includes a builder for Google Cloud.
Terraform uses provisioners to upload files, run shell scripts, or install and trigger other software like configuration management tools.
Let's add a creation-time provisioner to our vm_instance.
resource "google_compute_instance" "vm_instance" {
name = "terraform-instance"
machine_type = "f1-micro"
tags = ["web", "dev"]
provisioner "local-exec" {
command = "echo ${google_compute_instance.vm_instance.name}: ${google_compute_instance.vm_instance.network_interface[0].access_config[0].nat_ip} >> ip_address.txt"
}
...
}
Multiple provisioner blocks can be added to define multiple provisioning steps. The local-exec provisioner executes a command locally on the machine running Terraform, not on the VM instance itself. We're using this provisioner rather than the others (like remote-exec) so that we don't have to worry about specifying any connection info right now.
Each VM instance can have multiple network interfaces, so we refer to the first one with network_interface[0], counting from 0 as most programming languages do. Each network interface can have multiple access_config blocks as well, so once again we specify the first one.
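The same indexed path can be referenced outside the resource too, for example in an output block. A small sketch (the output name is arbitrary):

output "instance_ip" {
  value = google_compute_instance.vm_instance.network_interface[0].access_config[0].nat_ip
}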
Terraform treats provisioners differently from other arguments. Provisioners only run when a resource is created, but adding a provisioner does not force that resource to be destroyed and recreated.
Tainted resource
There are two options to tell Terraform to recreate a resource: you can destroy the resource manually (or remove it from your configuration and apply), or you can taint the resource.
terraform taint google_compute_instance.vm_instance
A tainted resource will be destroyed and recreated during the next apply.
terraform apply --auto-approve
The ip_address.txt file is now created on the Cloud Shell machine from which we are running Terraform.
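On newer Terraform versions, the -replace option of terraform plan and terraform apply is the recommended alternative to terraform taint, forcing the recreation in a single step:

terraform apply -replace="google_compute_instance.vm_instance"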
Failed Provisioners
If a resource is successfully created but fails a provisioning step, Terraform will error and mark the resource as tainted. A resource that is tainted still exists, but shouldn’t be considered safe to use, since provisioning failed.
When you generate your next execution plan, Terraform will remove any tainted resources and create new resources, attempting to provision them again after creation.
Destroy Provisioners
Provisioners can also be defined that run only during a destroy operation. These are useful for performing system cleanup, extracting data, etc.
For many resources, using built-in cleanup mechanisms is recommended, if possible (such as init scripts), but provisioners can be used if necessary.
This lab won’t show any destroy provisioner examples. If you need to use destroy provisioners, please see the provisioner documentation.
Destroy Infrastructure
Resources can be destroyed using the terraform destroy command, which is similar to terraform apply but behaves as if all of the resources have been removed from the configuration.
Just run the command:
terraform destroy
The - prefix indicates that the instance and the network will be destroyed. As with apply, Terraform shows its execution plan and waits for approval before making any changes.
Just like with terraform apply, Terraform determines the order in which things must be destroyed. Google Cloud won't allow a VPC network to be deleted if there are resources still in it, so Terraform waits until the instance is destroyed before destroying the network. When performing operations, Terraform creates a dependency graph to determine the correct order of operations. In more complicated cases with multiple resources, Terraform will perform operations in parallel when it's safe to do so.
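You can inspect this dependency graph yourself: terraform graph prints it in DOT format, which can be rendered with Graphviz if you have it installed (a sketch; the output file name is arbitrary):

terraform graph | dot -Tpng > graph.png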
Conclusion
Let me know your queries and feedback in the comments. In the next part, I will cover more topics like Terraform modules, Terraform state, and how we can import the current state of existing infrastructure into the Terraform state.