
Workload Orchestration with Nomad

After getting your services to the point where they can find each other using Consul, the next step is deployment. There are a few options out there for deploying workloads. This guide focuses on Nomad by HashiCorp because of its integration with Consul and its ease of deployment.

What is Nomad?

Nomad is a workload orchestration tool created by HashiCorp. The workloads it can deploy span from containers to scripts, which means even applications that ship as jars or shell scripts can be orchestrated by Nomad. This gives developers freedom in how they build an application, and it gives the ops side the ability to deploy workloads that have not been containerized.

Nomad also integrates with other HashiCorp offerings. It can connect with Consul for easy and automatic service registration of workloads, and it can work with Vault for secrets management.

Nomad comes as a pre-compiled binary for most major operating systems, which makes the installation simple and almost identical to the install we did for Consul. It also means the installation can easily be scripted with a configuration management tool like Ansible.

The heart of workload orchestration with Nomad is the job spec. Jobs are the high-level grouping of a deployment. Jobs are broken down into groups, and groups are broken down into tasks, which are the actual units of execution. This layered approach makes it easy to group components to match the architecture.

Jobs are written in HCL, the HashiCorp Configuration Language. Once written, the file can be passed to Nomad through the UI, the API, or the CLI. Like the other HashiCorp products, Nomad offers multiple ways to accomplish almost any task, and this flexibility makes it easy to integrate into existing workflows.
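For example, submitting a job from the CLI is a two-step process of planning and then running. A minimal sketch, assuming the job spec has been saved as docs.nomad (the filename is just an illustration):

# Preview the actions Nomad would take without changing anything
nomad job plan docs.nomad

# Submit the job and start the deployment
nomad job run docs.nomad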

The Job Specification

Writing job specs is the most important facet of using Nomad. To get the most from the tool, you need to be able to write a job that matches what you want. To help with this, Nomad provides a large number of features and options around how a job is defined. I’ll quickly walk through some of them using the example from the Nomad documentation.

The top level is the job declaration. There can only be one job per file, and its name must be unique within a Nomad cluster as well. In this top level section you can define the region and datacenters where you want the workload to run. You can also set the type of the workload, which determines how Nomad treats the job when a task completes: a batch type stops on completion, whereas a service type restarts the task. Another important item at the job level is the update setting, which determines how many tasks are updated simultaneously and how the updates are staggered.

# This declares a job named "docs". There can be exactly one
# job declaration per job file.
job "docs" {
  # Specify this job should run in the region named "us". Regions
  # are defined by the Nomad servers' configuration.
  region = "us"
  
  # Spread the tasks in this job between us-west-1 and us-east-1.
  datacenters = ["us-west-1", "us-east-1"]
  
  # Run this job as a "service" type. Each job type has different
  # properties. See the documentation below for more examples.
  type = "service"
  
  # Specify this job to have rolling updates, two-at-a-time, with
  # 30 second intervals.
  update {
    stagger      = "30s"
    max_parallel = 2
  }

The next section is the group. There can be multiple groups in a single job, but their names must be unique within the job. The first piece of information to specify in the group is the count, which is the number of instances of the group to deploy. Next comes the networking information, if needed. You can also define the service definition that will be sent to Consul at the group level. Additional configuration options can be found in the documentation and include things like constraints and metadata.

  # A group defines a series of tasks that should be co-located
  # on the same client (host). All tasks within a group will be
  # placed on the same host.
  group "webs" {
    # Specify the number of these tasks we want.
    count = 5

    network {
      # This requests a dynamic port named "http". This will
      # be something like "46283", but we refer to it via the
      # label "http".
      port "http" {}

      # This requests a static port on 443 on the host. This
      # will restrict the task to running once per host, since
      # there is only one port 443 on each host.
      port "https" {
        static = 443
      }
    }

    # The service block tells Nomad how to register this service
    # with Consul for service discovery and monitoring.
    service {
      # This tells Consul to monitor the service on the port
      # labelled "http". Since Nomad allocates high dynamic port
      # numbers, we use labels to refer to them.
      port = "http"

      check {
        type = "http"
        path = "/health"
        interval = "10s"
        timeout = "2s"
      }
    }

The final nested piece is the task. The task is the individual unit of work to be accomplished: the jar to run, the container to deploy, and so on. The task starts by declaring the driver that is going to be used; Nomad supports a large set of drivers for the various workloads, and each driver has its own config block inside the task. It is also possible to pass environment variables that can be used by the task. Finally, the resource allocation is set, which should be based on either measured or expected usage.

    # Create an individual task (unit of work). This particular
    # task utilizes a Docker container to front a web application.
    task "frontend" {
      # Specify the driver to be "docker". Nomad supports
      # multiple drivers.
      driver = "docker"

      # Configuration is specific to each driver.
      config {
        image = "hashicorp/web-frontend"
      }

      # It is possible to set environment variables which will be
      # available to the task when it runs.
      env {
        DB_HOST = "db01.example.com"
        DB_USER = "web"
        DB_PASS = "loremipsum"
      }

      # Specify the maximum resources required to run the task,
      # including CPU and memory.
      resources {
        cpu = 500 # MHz
        memory = 128 # MB
      }
    }

Demo

For this demo I have taken the Node.js app that I built for the Consul demo and wrapped it into a simple container. I will use Nomad to deploy the container onto the three VMs and have it register the app with Consul. This demo starts from the state of the Consul cluster being up and running. Consul provides automatic clustering for Nomad, which makes the entire startup process much easier and far less painful.

One thing to note about this deployment is that the Nomad client and server will be running on the same VMs. In a production deployment this would not be the case, as there would be a risk of the workloads exhausting resources and causing the server to crash. Since this is a small demo deployment, that is not a concern.

Installing Nomad

Installing Nomad is fairly simple since pre-compiled binaries are available. For CentOS 7 I used the following commands to quickly and easily install Nomad. Be sure to chown the data directory that is created to the user you are going to run Nomad as.

cd /usr/local/bin
sudo curl -o nomad.zip https://releases.hashicorp.com/nomad/1.0.4/nomad_1.0.4_linux_amd64.zip
sudo unzip nomad.zip
sudo rm nomad.zip
sudo mkdir -p /etc/nomad.d
sudo mkdir -p /opt/nomad/data
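If Nomad will run as a dedicated nomad user, which is an assumption for this sketch rather than something the commands above create, the ownership change looks like:

# Hand the data directory to the user that will run Nomad
sudo chown -R nomad:nomad /opt/nomad/data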

Once installed you can check that everything is working by running the nomad command and checking the version.
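A minimal sanity check; the exact output will vary with the build, but it should report the 1.0.4 release downloaded above:

nomad version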

Configuring Nomad

Nomad can be run from just command line arguments or a configuration file. I tend to use the configuration file as it simplifies things if I want to run Nomad as a service. There are a large number of options that can be passed to Nomad, but this demo will use a basic configuration.

Normally there would be a client.hcl and a server.hcl configuration file. For this demo, since I am running them together, there is a single config.hcl file.

datacenter = "lab"
data_dir = "/opt/nomad/data"
server {
  enabled = true
  bootstrap_expect = 3
}

client {
  enabled = true
}

The configuration file will go in the /etc/nomad.d/ directory.

Starting Nomad

Nomad has a very simple startup similar to Consul. You can start the agent using the following command.

nomad agent -config=/etc/nomad.d/config.hcl

Once it is running, the first place to check that everything is up is the Consul UI. There you should see the new nomad and nomad-client services listed.
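The same registration can be confirmed from the command line with Consul's own CLI, if you prefer that to the UI:

# Nomad's server and client services should now appear in the catalog
consul catalog services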

The next check is the Nomad UI, which can be reached on port 4646 at the /ui path on any of the Nomad server IPs.
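The Nomad CLI offers a similar view of the cluster; these two commands should list the three servers and the three clients respectively:

# Show the Nomad servers and which one is the leader
nomad server members

# Show the Nomad clients and whether they are ready for work
nomad node status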

Nomad UI

The Nomad UI provides a wealth of information along with the ability to manage jobs. On initial load the jobs screen is displayed, showing all of the jobs that have been submitted. There is also a button to deploy or update jobs by uploading a job file. Along the side there are tabs for clients, servers, and topology.

The clients tab shows a list of all the available Nomad clients in the cluster.

You can then click on a specific client to see information about it, including available drivers, storage volumes, and metadata.

The servers page offers a similar view, but of the connected Nomad servers.

In this case the individual view shows some basic information about the server.

Finally, the topology screen provides a visual view of where workloads are deployed and overall resource usage. This provides a great quick view of the entire ecosystem.

Job File

The job file used for this demo is fairly simple. It follows the example from Nomad for most of the options.

# This declares a job named "demo". There can be exactly one
# job declaration per job file.
job "demo" {
  datacenters = ["lab"]

  # Run this job as a "service" type. Each job type has different
  # properties. See the documentation below for more examples.
  type = "service"

  # Specify this job to have rolling updates, two-at-a-time, with
  # 30 second intervals.
  update {
    stagger      = "30s"
    max_parallel = 2
  }


  # A group defines a series of tasks that should be co-located
  # on the same client (host). All tasks within a group will be
  # placed on the same host.
  group "web" {
    # Specify the number of these tasks we want.
    count = 3

    network {
      # This requests a dynamic port named "http". This will
      # be something like "46283", but we refer to it via the
      # label "http".
      port "http" {}
    }

    # The service block tells Nomad how to register this service
    # with Consul for service discovery and monitoring.
    service {
      # This tells Consul to monitor the service on the port
      # labelled "http". Since Nomad allocates high dynamic port
      # numbers, we use labels to refer to them.
      port = "http"

      check {
        type     = "http"
        path     = "/"
        interval = "10s"
        timeout  = "2s"
      }
    }

    # Create an individual task (unit of work). This particular
    # task utilizes a Docker container to front a web application.
    task "app" {
      # Specify the driver to be "docker". Nomad supports
      # multiple drivers.
      driver = "docker"

      # Configuration is specific to each driver.
      config {
        image = "192.168.1.91:5000/web-test"
        ports = ["http"]
        args = [
            "${NOMAD_PORT_http}",
            "hello"
        ]
      }

      # Specify the maximum resources required to run the task,
      # including CPU and memory.
      resources {
        cpu    = 500 # MHz
        memory = 128 # MB
      }
    }
  }
}

To run the job it needs to be submitted to Nomad. This can be done using the UI.

Once the job file is loaded you can submit it to be planned. The plan step shows you the actions Nomad is going to take to deploy the job.

Once you click run, Nomad will begin allocating the tasks based on the plan.

As the tasks start up they will move into the healthy state if all is well.
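The same rollout can be watched from the CLI, which summarizes the job, its allocations, and their health:

nomad job status demo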

End Result

Now that the job is deployed we can look at how everything has been allocated. From the job view you can scroll down to active allocations.

Clicking on an allocation shows detailed information about that specific instance of the task.
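The CLI offers the same per-allocation detail; the allocation ID below is a placeholder to be copied from the nomad job status output:

# Inspect a single allocation
nomad alloc status <alloc-id>

# Tail the stdout of the app task in that allocation
nomad alloc logs -f <alloc-id> app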

We can also see the allocations show up on the topology page.

And finally we can see the services for the various tasks registered in Consul.

Conclusion

As you can see, Nomad can be a very powerful tool for deploying workloads. Not only is it easy to get up and running, it is also easy to roll out apps. While the demo above uses a Docker container, it could just as easily have been a legacy Java application. If you need a tool for deploying services in a scalable and easy-to-update fashion, I would recommend Nomad.

As a standalone tool Nomad provides a great deal of functionality. When paired with Consul it makes building, configuring, and maintaining a microservices architecture a breeze.
