The background to this is
https://bugzilla.redhat.com/show_bug.cgi?id=2073414, in response
to which git was changed to die if a user runs git commands
on a repo which they don't own. In openQA, the test directory
is a git repo and openQA itself likes to run git commands on it,
but this is often going to be as a different user than the owner
of the directory. In fact on the worker hosts, the user that owns
the directory (geekotest on the server box) doesn't even exist.
This just sets the config by copying a file in place rather than
running a git command (which is hard to get to be idempotent) and
uses `/etc/gitconfig` so we don't wind up with a file in the
_openqa-worker user's home directory, which is meant to be empty.
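As a rough sketch of what the copied file could contain, assuming the
setting involved is git's `safe.directory` option (this message does not
name it) and using a hypothetical placeholder path for the test directory:

```python
# Sketch only: renders the content of a system-wide gitconfig that marks
# directories as safe for git regardless of who owns them.
# Assumptions: the relevant setting is git's `safe.directory`; the path
# used below is a hypothetical placeholder, not taken from the playbook.


def render_gitconfig(safe_dirs):
    """Return gitconfig text with one safe.directory entry per path."""
    lines = ["[safe]"]
    for path in safe_dirs:
        lines.append(f"\tdirectory = {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gitconfig(["/path/to/openqa/tests"]), end="")
```

Rendering the same input always gives the same text, which is why copying
a file into place is easier to keep idempotent than running a git command.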
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This just cleans up the mess left by the bad parameter from
earlier. A run of this play broke halfway through, and we need to
run the remaining half without choking on this part.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This unifies prod and stg onto the ways of doing things for the
latest packages, and rejigs the swtpm stuff a bit to tear down
more (we shouldn't need the custom SELinux policy any more).
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This also tears down our swtpm systemd service setup, as
os-autoinst should now handle swtpm device setup for us.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
A recent test has a couple of perl deps, we need to ensure these
are installed on the worker hosts.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The main goal of these changes is to allow all workers in each
deployment NFS write access to the factory share. This is because
I want to try using os-autoinst's at-job-run-time decompression
of disk images instead of openQA's at-asset-download-time
decompression; it avoids some awkwardness with the asset file
name, and I think it should also allow us to drop the decompression
code from openQA.
I also rejigged various other things at the same time as they
kinda logically go together. It's mostly cleanups and tweaks to
group variables. I tried to handle more things explicitly with
variables, as it's better for use of these plays outside of
Fedora infra.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
On client end, restart mount unit (with daemon-reload) if mount
file changes. On server end, run exportfs -r if export config
file changes.
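The trigger logic described above can be sketched as follows. The file
names and the mount unit name are hypothetical placeholders; only the
daemon-reload, the mount unit restart and `exportfs -r` come from this
message.

```python
# Sketch only: maps "which config file changed" to the commands that
# should run afterwards. File and unit names are hypothetical placeholders.

MOUNT_FILE = "example.mount"  # hypothetical systemd mount unit file
EXPORTS_FILE = "exports"      # hypothetical NFS export config file


def commands_for(changed_files, mount_unit="example.mount"):
    """Return the commands to run for the given set of changed files."""
    commands = []
    if MOUNT_FILE in changed_files:
        # Client end: reload unit definitions, then restart the mount.
        commands.append(["systemctl", "daemon-reload"])
        commands.append(["systemctl", "restart", mount_unit])
    if EXPORTS_FILE in changed_files:
        # Server end: re-export everything in the export config.
        commands.append(["exportfs", "-r"])
    return commands
```

If nothing changed, the function returns an empty list, so an unchanged
run does not restart or re-export anything.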
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This is because swtpm is designed not to be persistent; it's
sort of tied to a single "system" (a VM in this case). We can't
expect an instance to stick around after it's been "used". It
doesn't do that; it exits successfully. So we need to restart it
when that happens.
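A minimal sketch of that "start it again whenever it exits" behaviour,
assuming nothing about how the restart is actually configured. The
function name and the bounded run count are illustrative only, and the
command used in the example is a stand-in, not a real swtpm invocation.

```python
# Sketch only: reruns a command each time it exits, even on success.
# A real deployment would leave this to a service manager; the bounded
# loop here just keeps the example finite.
import subprocess
import sys


def run_with_restarts(argv, max_runs):
    """Run argv up to max_runs times in a row; return each exit code."""
    codes = []
    for _ in range(max_runs):
        # A zero exit code is treated like any other: start it again.
        codes.append(subprocess.run(argv).returncode)
    return codes


if __name__ == "__main__":
    # Stand-in command that exits successfully straight away.
    print(run_with_restarts([sys.executable, "-c", "pass"], 3))
```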
Signed-off-by: Adam Williamson <awilliam@redhat.com>
Using `when` with `import_tasks` doesn't actually skip the import
entirely; it just imports the tasks and skips them one by one,
which reads oddly. `include_tasks` is properly dynamic, so it
seems better here.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
swtpm is a TPM emulator we want to use for testing Clevis on
IoT (and potentially other things in future). We're implementing
this by having os-autoinst just add the qemu args but expect
swtpm itself to be running already - that's counted as the
sysadmin's responsibility. My approach to this is to have openQA
tap worker hosts also be tpm worker hosts, meaning they run one
instance of swtpm per worker instance (as a systemd service) and
are added to a 'tpm' worker class which tests can use to ensure
they run on a suitably-equipped worker. This sets up all of that.
We need a custom SELinux policy module to allow systemd to run
swtpm - this is blocked by default.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It's failing on the new IAD worker and I can't figure out why.
Let's skip it for now just to get the plays run.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It shouldn't need anything but 10.0.2.*, and hopefully this will
stop it interfering with the rest of the infra network...
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This makes it easier to test non-master branches on stg. May extend this
design to other repos (like the tests...) later.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
The scratch builds are installed now and seem to be working, so
they're on their way to updates-testing and we don't need to
specify them here any more. Also drop the hack I put in to get
the service restart handler to run.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
It's failing and I don't see why, since I based this right on the
ansible docs. Maybe a |int will help?
Signed-off-by: Adam Williamson <awilliam@redhat.com>
This provides a mechanism for deploying scratch builds, and also
for controlling whether or not to install openQA and os-autoinst
from updates-testing.
I have been doing the scratch build thing for years already, just
manually by ssh'ing into the boxes. This is getting tiring now
we have like 15 worker hosts.
The scratch build mechanism isn't properly idempotent, but fixing
that would be hard and I really only intend to use it transiently
when I'm updating the packages, so I don't think it's worth the
effort.
This also adds a notification for restarting openQA worker
services when the packages or config are updated, and fixes the
worker playbook to enable the last worker service.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
openqa-aarch64-02.qa is broken in some very mysterious way:
https://pagure.io/fedora-infrastructure/issue/8750
Until we can figure that out, this should prevent it from picking up
normal jobs, but let us manually target a job at it whenever we
need to for debugging.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
I thought just having it WantedBy remote-fs.target should be
enough, but in fact this mount often fails on boot, and I forget
to check all the worker boxes until a bunch of tests fail and
everyone is sad. Let's try After=network-online.target and see
if that helps.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
In ansible 2.8 the - character isn't supposed to be valid in group names.
While we could override this, we might as well just bite the bullet and change it.
So, just switch all group names to use _ instead of -.
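The rename rule itself is simple enough to sketch; the group name in the
example is made up for illustration and is not taken from the inventory.

```python
# Sketch only: the rename rule applied to each group name.


def normalize_group_name(name):
    """Replace every '-' in a group name with '_'."""
    return name.replace("-", "_")


if __name__ == "__main__":
    # Hypothetical group name, for illustration only.
    print(normalize_group_name("example-tap-workers"))
```

Names that already contain no `-` pass through unchanged, so applying
the rule twice is harmless.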
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
Hopefully this bit in tap-setup.yml can now go away, as this
approach of using ansible_ifcfg_whitelist and _disabled does the
same thing in a cleaner way.
Signed-off-by: Adam Williamson <awilliam@redhat.com>