Deploying Elasticsearch with Terraform to a virtual machine in Microsoft Azure

TL;DR

If you want to go through the files yourself, you can find them in this GitHub repository. Everything you need is in the Terraform folder, except for the GitHub Actions pipeline files, of course. Just be aware that the repository also includes code for deploying an Azure Function, which may not be what you are after.

Prerequisites

This article assumes that you are familiar with provisioning infrastructure in Azure with Terraform. We are also picking up from where we left off in a previous article, so if you feel like an explanation is missing, you might want to take a look there.

Analyzing the Terraform file

Providers

We begin by adding two new providers, random and azapi, to the required_providers block from the previous article. We won’t cover their functionality right now; they will be used when generating the SSH key in a later step.

    random = {
      source  = "hashicorp/random"
      version = "~>3.0"
    }
    azapi = {
      source  = "azure/azapi"
      version = "~>1.5"
    }
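
For context, these entries sit inside the existing required_providers block; a minimal sketch of the full block, assuming the azurerm provider carried over from the previous article, might look like this:

terraform {
  required_providers {
    # azurerm comes from the previous article; this version constraint is an assumption
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~>3.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~>3.0"
    }
    azapi = {
      source  = "azure/azapi"
      version = "~>1.5"
    }
  }
}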

Networking

Next, we provide a network for our soon-to-be-created virtual machine (VM). If you are familiar with networking, the properties should be fairly intuitive. If you are not, don’t worry too much about them for now, unless you are planning on deploying this to production, of course.

resource "azurerm_virtual_network" "network" {
  name                = "${var.prefix}network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
}

resource "azurerm_subnet" "subnet" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.network.name
  address_prefixes     = ["10.0.2.0/24"]
}

Next, we define a public IP address for it. We went with a public one so we can easily enable SSH access. If you plan to access the VM only from within Azure, you can likely skip this resource. The “allocation_method” key defines how the IP address is allocated and accepts either “Static” or “Dynamic”. If you are unsure about what these values mean, take a look here.

resource "azurerm_public_ip" "publicip" {
  name                = "${var.prefix}public-ip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Dynamic"
}
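
If you would rather have a predictable address, a static variant is sketched below; note that the “Standard” SKU line is an assumption, not part of the original setup, and that Standard SKU public IPs only support static allocation:

resource "azurerm_public_ip" "publicip" {
  name                = "${var.prefix}public-ip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Static"   # the address is reserved as soon as the resource is created
  sku                 = "Standard" # assumption: Standard SKU, which requires static allocation
}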

Now we define the network security group (NSG), which is somewhat similar to a firewall, as you can check in this extensive comparison. In our case, we deploy it to allow access to port 22, used for SSH. Note that we are allowing any source address here, which should not be the case in production scenarios, unless you are trying to bait hackers or something.

resource "azurerm_network_security_group" "nsg" {
  name                = "nsg"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  security_rule {
    name                       = "SSH"
    priority                   = 1001
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}
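
For anything beyond a throwaway environment, you would typically restrict the source address in that rule; the CIDR below is a placeholder for your own public IP, not a value from the original files:

  security_rule {
    name                       = "SSH"
    priority                   = 1001
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "203.0.113.10/32" # placeholder: replace with your own public IP
    destination_address_prefix = "*"
  }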

Moving on, we create a network interface (NIC), which enables our VM to communicate with the internet, Azure, and on-premises resources. Inside the “ip_configuration” block, we allocate the private IP address dynamically while attaching the public IP address we defined earlier.

Then we create an azurerm_network_interface_security_group_association so that the rules from the NSG we defined earlier apply to our NIC.

resource "azurerm_network_interface" "nic" {
  name                = "${var.prefix}nic"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  ip_configuration {
    name                          = "configuration"
    subnet_id                     = azurerm_subnet.subnet.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.publicip.id
  }
}

resource "azurerm_network_interface_security_group_association" "nicnsg" {
  network_interface_id      = azurerm_network_interface.nic.id
  network_security_group_id = azurerm_network_security_group.nsg.id
}

Generating an SSH key

Here we generate a random SSH key to connect to our VM. First, we provide a random_pet resource to generate a unique name for the SSH key. Then, with the “azapi_resource” and “azapi_resource_action” resources, we can easily generate the key pair. We also define an output for it, in case we want to store the public key or do something else with it, but that’s optional.

resource "random_pet" "ssh_key_name" {
  prefix    = "ssh"
  separator = ""
}

resource "azapi_resource" "ssh_public_key" {
  type      = "Microsoft.Compute/sshPublicKeys@2022-11-01"
  name      = random_pet.ssh_key_name.id
  location  = azurerm_resource_group.rg.location
  parent_id = azurerm_resource_group.rg.id
}

resource "azapi_resource_action" "ssh_public_key_gen" {
  type        = "Microsoft.Compute/sshPublicKeys@2022-11-01"
  resource_id = azapi_resource.ssh_public_key.id
  action      = "generateKeyPair"
  method      = "POST"

  response_export_values = ["publicKey", "privateKey"]
}

output "key_data" {
  value = jsondecode(azapi_resource_action.ssh_public_key_gen.output).publicKey
}
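
If you also want the private key, so you can actually SSH into the VM later, a sketch along these lines should work; the output name is an assumption, and sensitive = true keeps the key out of the plan output:

output "private_key_data" {
  # the private key is only returned by the generateKeyPair action, so store it somewhere safe
  value     = jsondecode(azapi_resource_action.ssh_public_key_gen.output).privateKey
  sensitive = true
}

You could then export it locally with terraform output -raw private_key_data > vm_key.pem && chmod 600 vm_key.pem.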

Provisioning the Linux virtual machine

Finally, we write the Terraform code that creates our VM. The “admin_username” defines, as you can guess, the username of the default local administrator. “size” defines the VM size, which determines CPU, RAM, and other characteristics. You can find more about the available sizes here; the list is huge and complex.

In “network_interface_ids”, we simply link to our existing NIC. More than one can be specified, and the first Network Interface ID in this list will be the Primary Network Interface on the VM.

Next, in “admin_ssh_key”, we have the “username” field, which defines the username the SSH key is associated with, and the “public_key” field, which receives the public key generated earlier.

The “os_disk” block configures the disk used by the VM. The first setting, “caching”, defines the type of caching used for the internal OS disk; its value can be None, ReadOnly, or ReadWrite. Then we have “storage_account_type”, which defines the type of storage account backing the internal OS disk. The possible values are Standard_LRS, StandardSSD_LRS, Premium_LRS, StandardSSD_ZRS, and Premium_ZRS, and you can find more details here.

Lastly, we have the “source_image_reference” block, which defines the base OS image for the VM. Its individual fields don’t mean much on their own, but you can find a list of all the possible images in Microsoft’s documentation.

resource "azurerm_linux_virtual_machine" "elasticvm" {
  name                = "${var.prefix}elastic-vm"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  admin_username      = var.adminusername
  size                = "Standard_F2"

  network_interface_ids = [
    azurerm_network_interface.nic.id,
  ]

  admin_ssh_key {
    username   = var.adminusername
    public_key = jsondecode(azapi_resource_action.ssh_public_key_gen.output).publicKey
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-focal"
    sku       = "20_04-lts"
    version   = "latest"
  }
}
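
Because the public IP uses dynamic allocation, the actual address is only known once the VM is running. As a convenience (not part of the original files), you could expose it through the attribute the VM resource exports:

output "vm_public_ip" {
  # populated once the VM is up, since the public IP uses dynamic allocation
  value = azurerm_linux_virtual_machine.elasticvm.public_ip_address
}

Combined with the private key output sketched earlier, ssh -i vm_key.pem <admin_username>@$(terraform output -raw vm_public_ip) should get you onto the machine.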

Bootstrapping a VM in Azure

Now that our VM is defined, we need to somehow install Elasticsearch on it. One way to achieve this is with the “azurerm_virtual_machine_extension” resource. We point it to a VM through the “virtual_machine_id” configuration, and most of its logic lives inside “protected_settings”. As you can see, it receives a Base64-encoded script file, whose name is defined in our “setupelasticfile” variable.

resource "azurerm_virtual_machine_extension" "setupelastic" {
  name                 = "setup-elastic"
  virtual_machine_id   = azurerm_linux_virtual_machine.elasticvm.id
  publisher            = "Microsoft.Azure.Extensions"
  type                 = "CustomScript"
  type_handler_version = "2.0"

  protected_settings = <<PROT
    {
        "script": "${base64encode(file(var.setupelasticfile))}"
    }
    PROT
}

variable "setupelasticfile" {
  type    = string
  default = "yum.bash"
}
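
As a small design note, the heredoc works fine, but the same JSON can be built with jsonencode inside the extension resource, which avoids manual escaping; this is an equivalent alternative, not what the original files use:

  protected_settings = jsonencode({
    script = base64encode(file(var.setupelasticfile))
  })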

The bash script, in short, installs the whole ELK stack: Elasticsearch, Kibana, and Logstash. It can be customized according to your needs, of course.

#!/bin/bash
# Add the Elastic GPG key and APT repository, install Java, then install and start the ELK stack.
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list
sudo apt update && sudo apt install -y openjdk-8-jre-headless
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
sudo apt update && sudo apt install -y elasticsearch kibana logstash
sudo systemctl start elasticsearch.service
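
To check that everything came up, you could append a quick verification to the script; the sleep duration is an arbitrary assumption to give Elasticsearch time to start:

# Give Elasticsearch some time to boot, then hit its HTTP endpoint
sleep 30
curl -s http://localhost:9200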

And that’s it! Your Terraform script is now ready to be applied.