Collecting Hardware Metrics with Nomad

  • Configure Nomad Agents to supply Prometheus-compatible data
  • Configure Prometheus to access Nomad metrics endpoints
  • Configure Grafana to visualize Prometheus data

Configure Nomad

telemetry {
  collection_interval        = "1s"
  prometheus_metrics         = true
  publish_allocation_metrics = true
  publish_node_metrics       = true
}

curl http://192.168.2.201:4646/v1/metrics?format=prometheus
# HELP nomad_client_unallocated_cpu nomad_client_unallocated_cpu
# TYPE nomad_client_unallocated_cpu gauge
nomad_client_unallocated_cpu{datacenter="infrastructure_at_home",node_class="none",node_id="c2a79d23-9851-fd9b-dece-f86908d58d3c"} 5600
# HELP nomad_client_unallocated_disk nomad_client_unallocated_disk
# TYPE nomad_client_unallocated_disk gauge
nomad_client_unallocated_disk{datacenter="infrastructure_at_home",node_class="none",node_id="c2a79d23-9851-fd9b-dece-f86908d58d3c"} 26598
# HELP nomad_client_unallocated_memory nomad_client_unallocated_memory
# TYPE nomad_client_unallocated_memory gauge
nomad_client_unallocated_memory{datacenter="infrastructure_at_home",node_class="none",node_id="c2a79d23-9851-fd9b-dece-f86908d58d3c"} 926
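To check the endpoint without a running Prometheus server, the text output above can be read with a few lines of Python. This is a minimal sketch, not an official client: `parse_samples` and `EXAMPLE` are names made up for illustration, the parser skips comment lines and ignores optional timestamps, and the sample text is copied from the curl output above.

```python
import re

# One sample line: metric name, optional {labels}, then a single value token.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# One label pair inside the braces: key="value".
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples for every sample line in the text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # lines this simplified pattern cannot read are ignored
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Sample text taken from the curl output shown above.
EXAMPLE = '''
# HELP nomad_client_unallocated_cpu nomad_client_unallocated_cpu
# TYPE nomad_client_unallocated_cpu gauge
nomad_client_unallocated_cpu{datacenter="infrastructure_at_home",node_class="none",node_id="c2a79d23-9851-fd9b-dece-f86908d58d3c"} 5600
# HELP nomad_client_unallocated_disk nomad_client_unallocated_disk
# TYPE nomad_client_unallocated_disk gauge
nomad_client_unallocated_disk{datacenter="infrastructure_at_home",node_class="none",node_id="c2a79d23-9851-fd9b-dece-f86908d58d3c"} 26598
# HELP nomad_client_unallocated_memory nomad_client_unallocated_memory
# TYPE nomad_client_unallocated_memory gauge
nomad_client_unallocated_memory{datacenter="infrastructure_at_home",node_class="none",node_id="c2a79d23-9851-fd9b-dece-f86908d58d3c"} 926
'''

if __name__ == "__main__":
    for name, labels, value in parse_samples(EXAMPLE):
        print(name, labels["node_id"][:8], value)
```

Each tuple carries the metric name, its labels as a dictionary, and the value as a float, which is enough to confirm that node metrics are being published.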

Configure Prometheus

- job_name: nomad
  metrics_path: "/v1/metrics"
  params:
    format: ['prometheus']
  scheme: http
  static_configs:
    - targets:
        - raspi-3-1.node.consul:4646
        - raspi-3-2.node.consul:4646
        - raspi-4-1.node.consul:4646
        - raspi-4-2.node.consul:4646
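To see how the scrape job above translates into requests, the following sketch combines `scheme`, each target, `metrics_path`, and `params` into the URL Prometheus would fetch. It is an illustration only: `scrape_urls` and `NOMAD_JOB` are hypothetical names, the dictionary mirrors the YAML above by hand, and the fallback values are Prometheus's documented defaults (`http` and `/metrics`).

```python
from urllib.parse import urlencode


def scrape_urls(job):
    """Build the URL requested for each static target of a scrape job."""
    scheme = job.get("scheme", "http")
    path = job.get("metrics_path", "/metrics")
    # doseq=True expands list values, so {"format": ["prometheus"]}
    # becomes "format=prometheus".
    query = urlencode(job.get("params", {}), doseq=True)
    urls = []
    for group in job.get("static_configs", []):
        for target in group.get("targets", []):
            url = f"{scheme}://{target}{path}"
            urls.append(f"{url}?{query}" if query else url)
    return urls


# Hand-written mirror of the scrape job shown above.
NOMAD_JOB = {
    "job_name": "nomad",
    "metrics_path": "/v1/metrics",
    "params": {"format": ["prometheus"]},
    "scheme": "http",
    "static_configs": [
        {
            "targets": [
                "raspi-3-1.node.consul:4646",
                "raspi-3-2.node.consul:4646",
                "raspi-4-1.node.consul:4646",
                "raspi-4-2.node.consul:4646",
            ]
        }
    ],
}

if __name__ == "__main__":
    for url in scrape_urls(NOMAD_JOB):
        print(url)
```

The resulting URLs have the same shape as the curl call in the Nomad section, with the node's Consul DNS name in place of the IP address.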

Running Prometheus and Grafana as Docker Containers

task "prometheus" {
  driver = "docker"
  config {
    image = "prom/prometheus:latest"
    volumes = [
      "local/prometheus.yml:/etc/prometheus/prometheus.yml",
    ]
    mounts = [
      {
        type   = "volume"
        target = "/prometheus"
        source = "prometheus"
      }
    ]
    port_map {
      prometheus_ui = 9090
    }
  }
  ...
}
task "graphana" {
  driver = "docker"
  config {
    image = "grafana/grafana:6.6.2"
    mounts = [
      {
        type   = "volume"
        target = "/var/lib/grafana"
        source = "graphana"
      }
    ]
    port_map {
      http = 3000
    }
  }
  ...
}
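Once both containers are running, Grafana still needs Prometheus registered as a data source. One way to do this is Grafana's HTTP API (`POST /api/datasources`). The sketch below only prepares that request. The helper names, both host names, and the API token are placeholders assumed for illustration, not values from this setup; the ports 9090 and 3000 come from the `port_map` blocks above.

```python
import json
import urllib.request


def datasource_payload(prometheus_url, name="Prometheus"):
    """Return the JSON body describing a Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend queries Prometheus itself
        "isDefault": True,
    }


def build_request(grafana_url, payload, api_token):
    """Prepare (but do not send) the POST request to Grafana."""
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical addresses and token; replace them with your own.
    payload = datasource_payload("http://prometheus.example:9090")
    request = build_request("http://grafana.example:3000", payload, "YOUR_API_TOKEN")
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

The same data source can also be added by hand in the Grafana UI; the API route is mainly useful when the setup should be repeatable.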

Executing

>>nomad job run -address=http://192.168.2.204:4646 jobs/prometheus_docker.job.nomad
==> Monitoring evaluation "ad5a5599"
    Evaluation triggered by job "prometheus"
    Evaluation within deployment: "158c6e72"
    Allocation "1bf7bfd2" created: node "7be9e3ea", group "monitoring"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "ad5a5599" finished with status "complete"
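If the job is started from a script, the output above can be checked automatically. The following sketch extracts the created allocations and the final evaluation status. It assumes the output format shown here, which may differ between Nomad versions; `summarize_run` and `RUN_OUTPUT` are names chosen for illustration.

```python
import re

ALLOC_RE = re.compile(
    r'Allocation "([^"]+)" created: node "([^"]+)", group "([^"]+)"'
)
FINISHED_RE = re.compile(r'Evaluation "([^"]+)" finished with status "([^"]+)"')


def summarize_run(output):
    """Collect allocations and the final evaluation status from CLI output."""
    allocations = [
        {"allocation": alloc, "node": node, "group": group}
        for alloc, node, group in ALLOC_RE.findall(output)
    ]
    finished = FINISHED_RE.search(output)
    return {
        "allocations": allocations,
        "evaluation": finished.group(1) if finished else None,
        "status": finished.group(2) if finished else None,
    }


# Output copied from the run shown above.
RUN_OUTPUT = '''
==> Monitoring evaluation "ad5a5599"
    Evaluation triggered by job "prometheus"
    Evaluation within deployment: "158c6e72"
    Allocation "1bf7bfd2" created: node "7be9e3ea", group "monitoring"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "ad5a5599" finished with status "complete"
'''

if __name__ == "__main__":
    print(summarize_run(RUN_OUTPUT))
```

A status of `complete` only means the scheduler placed the allocation; whether the containers are healthy still has to be checked separately, for example with `nomad job status`.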

Conclusion
