17 Commits

Author SHA1 Message Date
Adam Williamson
2dbf99e280 openqa/worker: bump load average threshold for big worker hosts
This is a new feature in openQA that prevents worker hosts
from picking up new jobs if their load average is above a certain
threshold. It defaults to 40. Our big worker hosts tend to run
above this, so let's bump it on those.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2024-08-30 23:27:48 -07:00
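The load check this commit describes can be sketched roughly as below. This is an illustration only, not openQA's own code: the function name `can_accept_jobs`, the raised value of 80, and the use of the 1-minute load figure are all assumptions; only the default of 40 comes from the commit message.

```python
import os

DEFAULT_THRESHOLD = 40.0  # default quoted in the commit message


def can_accept_jobs(load_average: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True if the load is at or below the threshold (hypothetical helper)."""
    return load_average <= threshold


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_minute_load, _, _ = os.getloadavg()
    # 80.0 is a made-up example of a bumped threshold for a big worker host.
    print(can_accept_jobs(one_minute_load, threshold=80.0))
```

A host that routinely sits above the default would never pick up work, which is why the commit raises the threshold on the big hosts rather than leaving it at 40.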
Adam Williamson
f1e0e0d037 Fix openqa_tap truthiness checks
Sigh, |bool doesn't do what you might think it does:
https://medium.com/opsops/wft-bool-filter-in-ansible-e7e2fd7a148f

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-11-25 14:58:36 -08:00
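The surprise the linked article describes can be approximated in plain Python. The helper below, `ansible_like_bool`, is a hypothetical stand-in that mimics the long-standing behaviour of the `bool` filter (only a short list of values counts as true); it is not Ansible's code, newer Ansible releases may behave differently, and the value `"tap2"` is just an example string.

```python
TRUTHY_VALUES = ("yes", "on", "1", "true", 1)


def ansible_like_bool(value):
    """Approximate the classic behaviour of Ansible's `bool` filter."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        value = value.lower()
    # Anything outside the short truthy list is False,
    # including ordinary non-empty strings.
    return value in TRUTHY_VALUES


if __name__ == "__main__":
    print(bool("tap2"))               # Python truthiness: non-empty string
    print(ansible_like_bool("tap2"))  # filter-style check: not in the list
```

Under these assumptions, a non-empty string that Python treats as true is treated as false by the filter, so a truthiness check written with `|bool` can silently take the wrong branch.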
Adam Williamson
28110d34be openqa/worker: prepare to handle multiple tap worker classes
I'm going to try splitting the tap jobs across multiple worker
hosts. We have quite a lot of tap jobs now, and I have sometimes
seen a situation where all non-tap jobs have been run and
the non-tap worker hosts are sitting idle, but the single tap
worker host has a long queue of tap jobs to get through.

We can't just put multiple hosts per instance into the tap
class, because then we might get a case where job A from a tap
group is run on one host and job B from a tap group is run on
a different host, and they can't communicate. It's actually
possible to set this up so it works, but it needs yet more
complex networking stuff I don't want to mess with. So instead
I'm just gonna split the tap job groups across two classes,
'tap' and 'tap2'. That way we can have one 'tap' worker host
and one 'tap2' worker host per instance and arch, and they will
each get about half the tap jobs.

Unfortunately since we only have one aarch64 worker for lab it
will still have to run all the jobs, but for all other cases we
do have at least two workers, so we can split the load.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-11-25 14:11:58 -08:00
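The split described above, where each tap job group is pinned to exactly one worker class so its jobs always land on the same host, could look something like this sketch. The round-robin assignment and the group names are assumptions for illustration; the commit message does not say how groups are divided between 'tap' and 'tap2'.

```python
from itertools import cycle


def assign_tap_classes(job_groups, classes=("tap", "tap2")):
    """Map each tap job group to one worker class, round-robin (hypothetical)."""
    return {group: cls for group, cls in zip(job_groups, cycle(classes))}


if __name__ == "__main__":
    # Made-up group names; every job in a group shares the group's class,
    # so jobs that need to talk to each other stay on one host.
    groups = ["group_a", "group_b", "group_c", "group_d"]
    for group, worker_class in assign_tap_classes(groups).items():
        print(group, worker_class)
```

Because the class is chosen per group rather than per job, no group is ever spread across two hosts, and each class ends up with about half the groups.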
Adam Williamson
95f062c07a openQA: allow all workers NFS write access, other tweaks
The main goal of these changes is to allow all workers in each
deployment NFS write access to the factory share. This is because
I want to try using os-autoinst's at-job-run-time decompression
of disk images instead of openQA's at-asset-download-time
decompression; it avoids some awkwardness with the asset file
name, and should also actually allow us to drop the decompression
code from openQA I think.

I also rejigged various other things at the same time as they
kinda logically go together. It's mostly cleanups and tweaks to
group variables. I tried to handle more things explicitly with
variables, as it's better for use of these plays outside of
Fedora infra.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 16:10:32 -08:00
Adam Williamson
6b196e70ab openqa/worker: set up swtpm service on tap worker hosts
swtpm is a TPM emulator we want to use for testing Clevis on
IoT (and potentially other things in future). We're implementing
this by having os-autoinst just add the qemu args but expect
swtpm itself to be running already - that's counted as the
sysadmin's responsibility. My approach to this is to have openQA
tap worker hosts also be tpm worker hosts, meaning they run one
instance of swtpm per worker instance (as a systemd service) and
are added to a 'tpm' worker class which tests can use to ensure
they run on a suitably-equipped worker. This sets up all of that.
We need a custom SELinux policy module to allow systemd to run
swtpm - this is blocked by default.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-06-24 16:59:11 -07:00
Adam Williamson
a6b9c5392d openqa/worker: disable aarch64-02 with a special worker class
openqa-aarch64-02.qa is broken in some very mysterious way:
https://pagure.io/fedora-infrastructure/issue/8750
until we can figure that out, this should prevent it picking up
normal jobs, but let us manually target a job at it whenever we
need to for debugging.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-04-24 21:34:26 +02:00
Adam Williamson
5759ecd1b6 openqa/worker: try and avoid failures in the NFS mount
I thought just having it WantedBy remote-fs.target should be
enough, but in fact this mount often fails on boot, and I forget
to check all the worker boxes until a bunch of tests fail and
everyone is sad. Let's try After=network-online.target and see
if that helps.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2019-06-15 09:11:40 -07:00
Paul Whalen
d05660192c Add aarch64 to workers template.
2018-03-06 09:47:01 +00:00
Adam Williamson
7404c551e3 openqa/worker: correct WORKER_CLASS for ppc64 tap workers
2017-08-21 20:48:29 -07:00
Adam Williamson
de64bbf198 openqa/worker: tap workers have default classes too
we don't want these workers to *only* run tap tests, so put the
default classes into their WORKER_CLASS too.
2016-05-05 14:47:56 -07:00
Adam Williamson
b0b7dc9b47 openqa/worker: give up on GRE, single tap host instead
OK, this GRE crap ain't working. Let's give up! Instead let's
have one tap-capable host per openQA deployment, so all the
tap jobs will go to it. This...should achieve that. Let's see
what blows up.
2016-05-05 14:10:46 -07:00
Adam Williamson
5ddbf54811 openqa/worker: oh ok, probably this
duh quotes are hard
2016-05-05 11:22:42 -07:00
Adam Williamson
4ec8d3f50a openqa/worker - okay maybe this? WHO KNOWS LET'S SEE
watch the pretty pretty fireworks
2016-05-05 11:18:21 -07:00
Adam Williamson
7a37862fbc openqa/worker: try setting up GRE tunnels between worker hosts
everyone stand back, this one's gonna go boom.
2016-05-05 10:32:57 -07:00
Adam Williamson
9ce401e74d use an ifup-pre-local for tap device creation
holy crap, this is some ancient magic.
2016-04-27 15:46:13 -07:00
Adam Williamson
48291f1640 openqa/worker: initial attempt at openvswitch config
this is highly experimental and for deployment only to stg at
present...I have this stuff working on happyassassin, now trying
to translate it to stg.
2016-04-27 13:32:56 -07:00
Adam Williamson
2b098b34bd set up for openQA deployment
This adds openQA server, worker and dispatcher roles, and
applies them to the appropriate hosts. A few secret vars are
required. See trac #4958 for discussion.
2015-11-13 09:49:00 -08:00