Files
fedora-infra_ansible/roles/openqa/worker/tasks/nfs-client.yml
Adam Williamson 666196bbed openqa/worker: don't start worker unless NFS mount is up
There's this annoying pattern where the NFS mount fails on boot
and then the worker services all start up and take jobs, but they
instafail because the share isn't there.

Ideally we could handle this very easily with Restart= directives
but systemd has...*opinions* about this:

https://github.com/systemd/systemd/issues/4468
https://github.com/systemd/systemd/issues/1312

so we have to do some fairly awkward hacks to just express:

* Retry the NFS mount if it fails
* Don't start the workers unless the NFS mount is up
* Retry the workers after a while if they were blocked

It's ugly, but in testing this same config on one worker it seems
to work...

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2025-07-10 19:07:54 -07:00

41 lines
1.4 KiB
YAML

# Required vars
# - openqa_hostname
## string - hostname of the openQA server (we assume it is hosting the NFS mount)
---
- name: Install NFS client
ansible.builtin.package: name=nfs-utils state=present
tags:
- packages
# We don't check ownership as, after mounting, it's owned by whatever the
# UID of geekotest is on the server
- name: Ensure mount target exists
ansible.builtin.file: path=/var/lib/openqa/share state=directory mode=0755
- name: Create mount service
ansible.builtin.template: src=var-lib-openqa-share.service.j2 dest=/etc/systemd/system/var-lib-openqa-share.service owner=root group=root mode=0644
register: sharemount
tags:
- config
- name: Remove old mount unit
ansible.builtin.file: path=/etc/systemd/system/var-lib-openqa-share.mount state=absent
- name: Ensure dropin directory exists
ansible.builtin.file: path=/etc/systemd/system/openqa-worker-plain@.service.d state=directory mode=0755
- name: Add dropin to make workers not start unless share is mounted
ansible.builtin.copy: src=owp-require-mount.conf dest=/etc/systemd/system/openqa-worker-plain@.service.d/owp-require-mount.conf
tags:
- config
- name: Enable and start mount
service: name={{ item }} enabled=yes state=started
with_items:
- var-lib-openqa-share.service
- remote-fs.target
- name: Restart mount if unit changed
systemd: name=var-lib-openqa-share.service state=restarted daemon_reload=yes
when: sharemount is changed