…or rather: How I choose to back up databases when using Nomad.
When I was researching backup options after switching to Nomad, I considered using something like docker-db-backup. I quickly realized it has one downside: I would have to remember to keep the postgres-client version in the backup container aligned with the server version in the database container. As I was running five different databases (Postgres/MySQL) at the time, that was a deal-breaker for me.
After more reading, I decided to write a bash script that relies on Nomad’s raw_exec driver and cron capabilities.
Consul helps us obtain the allocation ID of the task we’re interested in. With that ID, we can run nomad alloc exec to call pg_dump inside the database’s Docker container.
What happens to the dump after that is up to us. I decided to pipe the output into Docker again, using an s3cmd Docker image to put it in an S3 bucket (actually a Minio bucket). Note: as a good practice, I recommend using a backup location outside your data center. I was using Minio as a training exercise.
Preparing the ACL token for the script
You can skip this part if you’re not using Nomad’s ACL capabilities.
# nomad-exec-policy.hcl
namespace "default" {
  policy = "write"
  capabilities = ["alloc-exec"]
}
Create a new policy using the file above:
nomad acl policy apply -description "Nomad exec policy" nomad-exec nomad-exec-policy.hcl
Create a new token. The Secret ID is the value you will need:
nomad acl token create --global -name="Nomad exec token" -policy=nomad-exec -type=client
Accessor ID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
Secret ID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
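To confirm the new token works before wiring it into the backup job, one option is to pass the Secret ID to the CLI through the NOMAD_TOKEN environment variable and ask Nomad to describe it. This is only a minimal sketch: `check_nomad_token` is a hypothetical helper name, and it assumes the nomad CLI is installed and NOMAD_ADDR already points at your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a Secret ID is accepted by Nomad.
# Assumes the nomad CLI is on PATH and NOMAD_ADDR points at the cluster.
check_nomad_token() {
  local secret_id="$1"
  # NOMAD_TOKEN is set only for this single invocation.
  NOMAD_TOKEN="$secret_id" nomad acl token self
}
```

Calling `check_nomad_token` with the Secret ID should print the token's details, including the nomad-exec policy attached to it.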
Nomad database job and backup job
Prerequisites: I’m using Consul for service discovery and Vault for fetching passwords, but your mileage may vary here.
Once raw_exec is enabled on the Nomad client, you are ready to create the backup job.
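For reference, raw_exec is switched on through a plugin block in the client's agent configuration. The sketch below drops such a fragment into a configuration directory. The function name, the file name `raw_exec.hcl`, and the `/etc/nomad.d` path mentioned afterwards are assumptions about your setup, not something Nomad mandates.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a client config fragment that enables raw_exec.
# $1 is the Nomad configuration directory (an assumption about your setup).
write_raw_exec_config() {
  local conf_dir="$1"
  cat > "${conf_dir}/raw_exec.hcl" <<'EOF'
plugin "raw_exec" {
  config {
    enabled = true
  }
}
EOF
}
```

On a typical install this might be called as `write_raw_exec_config /etc/nomad.d` by a user with write access, followed by a restart of the Nomad client. Keep in mind that raw_exec tasks run without isolation, as the same user as the Nomad agent.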
I’m using Vault to obtain the database credentials for the backup and the S3 credentials for s3cmd. s3cmd then sends the backup to Minio, which is exposed within the private network on port 9000.
To make it all work together, the database task needs to expose its Nomad allocation ID. We can do that by registering a service in Consul and putting the allocation ID in a service tag.
To give you a better picture, here is the database job (a slimmed-down version with irrelevant definitions removed):
job "db-task" {
  datacenters = ["dc1"]
  type = "service"
  vault {
    policies = ["nomad-read"]
  }
  group "db-task" {
    network {
      port "db" {
        to = 5432
        # I'm using an internal network called 'private'
        host_network = "private"
      }
    }
    task "db-task" {
      driver = "docker"
      config {
        # omitting the volume mount here for brevity
        image = "postgres:14.0-alpine3.14"
        ports = ["db"]
      }
      template {
        data = <<EOH
{{- with secret "kv-v1/nomad/db/postgres" -}}
POSTGRES_PASSWORD="{{ .Data.password }}"
POSTGRES_USER="{{ .Data.user }}"
POSTGRES_DB="{{ .Data.db }}"
{{- end -}}
EOH
        destination = "secrets/file.env"
        env = true
      }
      resources {
        cpu = 200
        memory = 200
        memory_max = 300
      }
      service {
        name = "db-task"
        port = "db"
        # the backup job relies on this particular 'alloc' tag
        tags = ["alloc=${NOMAD_ALLOC_ID}"]
        check {
          type = "tcp"
          interval = "10s"
          timeout = "2s"
        }
      }
    }
  }
}
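The tag registered above has the form alloc=<allocation id>, so anything that can list the service's tags can recover the ID. Outside of Nomad templates, the same lookup can be sketched in plain bash. `alloc_id_from_tags` is a hypothetical helper and the tags in the example are made up; it is here only to illustrate the convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the allocation ID carried by the first
# "alloc=<id>" tag among the arguments; return 1 if there is none.
alloc_id_from_tags() {
  local tag
  for tag in "$@"; do
    case "$tag" in
      alloc=*)
        # strip the "alloc=" prefix and print the remainder
        printf '%s\n' "${tag#alloc=}"
        return 0
        ;;
    esac
  done
  return 1
}

# Example with made-up tags:
alloc_id_from_tags "primary" "alloc=1234abcd"   # prints 1234abcd
```

The tags themselves could come from Consul's catalog API, for example `curl -s http://127.0.0.1:8500/v1/catalog/service/db-task | jq -r '.[0].ServiceTags[]'`, assuming a local Consul agent on the default address and jq being installed.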
And here is the completely separate backup job:
job "db-backup" {
  datacenters = ["dc1"]
  type = "batch"
  vault {
    policies = ["nomad-read"]
  }
  periodic {
    cron = "0 22 * * * *"
    prohibit_overlap = true
  }
  group "db-backup" {
    task "postgres-backup" {
      driver = "raw_exec"
      config {
        command = "/bin/bash"
        args = ["local/script.sh"]
      }
      template {
        data = <<EOH
# fail the task if any command in the pipeline fails, not only the last one
set -eo pipefail
nomad alloc exec -task db-task $DB_ALLOC_ID \
  /bin/bash -c "PGPASSWORD=$PGPASSWORD PGUSER=$PGUSER PGDATABASE=$PGDATABASE pg_dump --compress=4 -v" | \
  docker run -i --rm \
  -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  d3fk/s3cmd:stable \
  --host=$S3_HOST_BASE \
  --no-ssl \
  --host-bucket=$S3_HOST_BASE -v \
  put - s3://$S3_BUCKET/$(date "+%Y-%m-%d---%H-%M-%S").dump.gz
EOH
        destination = "local/script.sh"
      }
      template {
        data = <<EOH
{{- with secret "kv-v1/nomad/db/postgres" -}}
PGPASSWORD="{{ .Data.password }}"
PGUSER="{{ .Data.user }}"
PGDATABASE="{{ .Data.db }}"
{{ end }}
{{- with secret "kv-v1/nomad/s3/backup" -}}
AWS_ACCESS_KEY_ID="{{ .Data.access_key_id }}"
AWS_SECRET_ACCESS_KEY="{{ .Data.secret_access_key }}"
# here you also might want to set the NOMAD_TOKEN env variable
# if you're using ACL capabilities
{{ end }}
# as the service 'db-task' is registered in Consul,
# we want to grab its 'alloc' tag
{{- range $tag, $services := service "db-task" | byTag -}}
{{if $tag | contains "alloc"}}
{{$allocId := index ($tag | split "=") 1}}
DB_ALLOC_ID="{{ $allocId }}"
{{end}}
{{end}}
# relying on service DNS discovery provided by Consul
# to obtain the Minio IP address
S3_HOST_BASE=minio.service.consul:9000
S3_BUCKET=my-bucket-name
EOH
        destination = "secrets/file.env"
        env = true
      }
      resources {
        cpu = 200
        memory = 200
        memory_max = 300
      }
    }
  }
}
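A note on error handling in a pipeline like the one in the script: without the pipefail option, bash reports only the exit status of the last command, so a failed dump can go unnoticed as long as the upload succeeds. Here is a small demonstration, using `false` and `cat` as stand-ins for pg_dump and s3cmd (the function names are made up for this example):

```shell
#!/usr/bin/env bash
# Stand-ins: `false` plays a failing pg_dump, `cat` plays the uploader.

# Exit status of the pipeline with bash's default behaviour.
status_without_pipefail() {
  (
    set +o pipefail
    if false | cat >/dev/null; then echo 0; else echo "$?"; fi
  )
}

# Exit status of the same pipeline with pipefail enabled.
status_with_pipefail() {
  (
    set -o pipefail
    if false | cat >/dev/null; then echo 0; else echo "$?"; fi
  )
}

status_without_pipefail   # prints 0: the failure is hidden
status_with_pipefail      # prints 1: the failure is reported
```

This is why `set -eo pipefail` is worth having at the top of such a script: a failed dump then marks the Nomad task as failed, even if the upload step itself went through.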
I like this approach because not much magic is going on here: we’re simply calling a plain bash script and piping the output from one running Docker container into another. As long as there are no breaking changes in pg_dump, we can forget about the backup job; it should just work.
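Before forgetting about it entirely, it is worth triggering one run by hand rather than waiting for the schedule. A minimal sketch using Nomad's periodic force command follows. The helper name is hypothetical, and the default job name `db-backup` is taken from the job above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: launch a periodic job immediately, outside its schedule.
# Assumes the nomad CLI is on PATH and, with ACLs enabled, NOMAD_TOKEN is set.
force_backup() {
  local job="${1:-db-backup}"
  nomad job periodic force "$job"
}
```

After calling `force_backup`, `nomad job status db-backup` should list the newly launched child job, and a fresh dump should show up in the bucket.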