376 Commits

Author SHA1 Message Date
Adam Williamson
de979123fa openQA: don't install the fedoraupdaterestart plugin any more
We don't need it, we use upstream RETRY now.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-12-19 16:16:11 -08:00
Adam Williamson
f1e0e0d037 Fix openqa_tap truthiness checks
Sigh, |bool doesn't do what you might think it does:
https://medium.com/opsops/wft-bool-filter-in-ansible-e7e2fd7a148f

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-11-25 14:58:36 -08:00
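
The `|bool` surprise mentioned in f1e0e0d037 can be illustrated with a rough Python approximation. This is a sketch of how Ansible's `bool` filter has classically behaved, not Ansible's actual source, and the sample values (including the string "tap") are chosen only for illustration. The point is that an arbitrary non-empty string is truthy in Python but is not one of the values the filter recognises as true.

```python
# Illustrative approximation only; not copied from Ansible.
RECOGNISED_TRUE = ("yes", "on", "1", "true", 1)


def ansible_like_bool(value):
    """Return True only for a small set of recognised truthy values."""
    # None and real booleans pass through unchanged.
    if value is None or isinstance(value, bool):
        return value
    # Strings are compared case-insensitively.
    if isinstance(value, str):
        value = value.lower()
    return value in RECOGNISED_TRUE


# Compare plain Python truthiness with the filter-like behaviour.
for sample in ["yes", "True", "tap", "", 1, 0]:
    print(repr(sample), "python:", bool(sample), "filter:", ansible_like_bool(sample))
```

A check written as "is this variable set to anything" therefore needs a plain truthiness or definedness test rather than this filter.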
Adam Williamson
28110d34be openqa/worker: prepare to handle multiple tap worker classes
I'm going to try splitting the tap jobs across multiple worker
hosts. We have quite a lot of tap jobs now, and I have sometimes
seen a situation where all non-tap jobs have been run and the
non-tap worker hosts are sitting idle, but the single tap worker
host has a long queue of tap jobs to get through.

We can't just put multiple hosts per instance into the tap
class, because then we might get a case where job A from a tap
group is run on one host and job B from a tap group is run on
a different host, and they can't communicate. It's actually
possible to set this up so it works, but it needs yet more
complex networking stuff I don't want to mess with. So instead
I'm just gonna split the tap job groups across two classes,
'tap' and 'tap2'. That way we can have one 'tap' worker host
and one 'tap2' worker host per instance and arch, and they will
each get about half the tap jobs.

Unfortunately since we only have one aarch64 worker for lab it
will still have to run all the jobs, but for all other cases we
do have at least two workers, so we can split the load.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-11-25 14:11:58 -08:00
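
The splitting idea in 28110d34be can be sketched as follows. This is a minimal illustration, assuming a simple round-robin assignment; the group names are hypothetical and the real assignment in the test configuration may be done differently. What matters is that every job in a group gets the same class, so the whole group lands on the one host serving that class and its jobs can still communicate.

```python
from itertools import cycle


def assign_worker_classes(job_groups, classes=("tap", "tap2")):
    """Map each tap job group to exactly one worker class, round-robin."""
    assignment = {}
    # Sorting makes the mapping stable between runs.
    for group, worker_class in zip(sorted(job_groups), cycle(classes)):
        assignment[group] = worker_class
    return assignment


# Hypothetical tap job group names, for illustration only.
groups = ["group_a", "group_b", "group_c", "group_d"]
print(assign_worker_classes(groups))
```

With one host per class, each host ends up with about half of the groups. A deployment with a single worker, like the aarch64 lab case described above, would simply serve both classes.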
Adam Williamson
1c95ec9a35 Revert "openQA: set higher LimitRequestLine in httpd vhost config"
This reverts commit 892453da7e.
openQA still had problems with the very long request, so I just
did an ugly hack to get the request under the limit instead.
2022-10-21 17:12:15 -07:00
Adam Williamson
892453da7e openQA: set higher LimitRequestLine in httpd vhost config
The openQA job scheduler was hitting 414 errors today because
an update has so many builds there are more than 8190 characters
(the default limit) in the POST request. Let's bump the limit
to 16000.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-10-21 08:38:05 -07:00
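
The limit described in 892453da7e applies to the length of the HTTP request line. The sketch below shows how a long list of builds in the query string can exceed 8190 characters while staying under 16000. The endpoint path, parameter name, and build names are all hypothetical and are not taken from the scheduler.

```python
from urllib.parse import urlencode

DEFAULT_LIMIT = 8190  # default limit quoted in the commit message
RAISED_LIMIT = 16000  # value the commit raised it to


def request_line(method, path, params, version="HTTP/1.1"):
    """Build the first line of an HTTP request, query string included."""
    return f"{method} {path}?{urlencode(params)} {version}"


def fits(line, limit):
    """Return True if the request line is within the given limit."""
    return len(line) <= limit


# Hypothetical update with 400 builds.
builds = [f"some-package-{i}-1.0-1.fc37" for i in range(400)]
line = request_line("POST", "/hypothetical/schedule", {"BUILDS": ",".join(builds)})
print(len(line), fits(line, DEFAULT_LIMIT), fits(line, RAISED_LIMIT))
```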
Adam Williamson
f122367c34 openqa/worker: change name on kernel override config file
It really needs to be called exactly 60-block-scheduler.rules
as it's overriding a file of the same name in `/usr`.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-08-17 14:28:18 -04:00
Adam Williamson
ca8e3db401 Add file missed from previous commit
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-08-17 13:46:16 -04:00
Adam Williamson
5b49611201 openqa/worker: override kernel scheduler config
This applies only within Fedora infra for now, as we're not sure
whether worker hosts on different hardware hit this bug. It's
intended to work around:
https://bugzilla.redhat.com/show_bug.cgi?id=2009585
a bug which results in the infra worker hosts hanging after a
short time when running kernels newer than 5.11.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-08-17 13:34:15 -04:00
Adam Williamson
8e891fe4d5 openqa/server: update for git default branch rename
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-07-14 11:58:09 -07:00
Adam Williamson
7ba67fdc12 openQA: don't enable FedoraUpdateRestart plugin
Upstream implemented a feature that we can use to do the same
thing using just a test variable, so we're switching to that.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-07-06 10:42:26 -07:00
Adam Williamson
a91dfc29e9 openqa: twiddle with the delegation stuff again
Ugh, we delegate for the assetsize stuff too and there's tons of
that, splitting it would be awful. Let's try a different approach
with a new optional variable for the delegate target.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 16:32:04 -07:00
Adam Williamson
42e930e97f openqa-onebox: tweak db host stuff
Using the machine's own hostname works for the ansible delegate
stuff but doesn't work for openQA itself (if you try and access
the DB by hostname like this, postgres denies access; you have
to use 'localhost' for postgres to allow it). Using 'localhost'
works for postgres but doesn't do the right thing for delegation.
Let's use 'localhost' and split the two play steps into
delegated and non-delegated versions.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 16:17:29 -07:00
Adam Williamson
d227dac859 openqa/dispatcher: don't require resultsdb and wiki URL/hostname
The config file should treat these as optional, since not every
openQA instance wants to report results.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 15:21:15 -07:00
Adam Williamson
4fd83483fa openqa/dispatcher: comment on the needed packages
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 15:21:15 -07:00
Adam Williamson
ccf3b23cd4 openqa/server: skip openqa.ini amqp section if vars not set
We don't want to include this section if the vars aren't set.
Not every openQA server has to be an AMQP publisher.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 15:21:15 -07:00
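
The behaviour described in ccf3b23cd4 amounts to emitting a config section only when its variables are set. The actual change is presumably made in a template; the Python sketch below only illustrates the logic, and the section contents and key names are hypothetical.

```python
import configparser
import io


def render_ini(base, amqp=None):
    """Render an ini file, adding [amqp] only if all its values are set."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep key case as given
    for section, values in base.items():
        config[section] = values
    # Skip the section entirely if it is missing or has unset values.
    if amqp and all(amqp.values()):
        config["amqp"] = amqp
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical settings, for illustration only.
base = {"global": {"example_key": "example_value"}}
print(render_ini(base))
print(render_ini(base, {"url": "amqps://broker.example.org"}))
```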
Adam Williamson
6c2991306c openqa/server: only install nfs-utils when needed
If there are no NFS workers, we don't need the NFS server.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 15:21:15 -07:00
Adam Williamson
0cf8a59fd5 openqa: fix openqa_nfs_{worker,client}s confusion again
Missed from previous commit.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 13:26:22 -07:00
Adam Williamson
e5c5cc336f openqa: fix confusion between openqa_nfs_{worker,client}s
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-06-07 13:26:01 -07:00
Adam Williamson
bf4f704096 openqa: improve how we do the git config thing
The background to this is
https://bugzilla.redhat.com/show_bug.cgi?id=2073414 , in response
to which git was changed to die if a user runs git commands
on a repo which it doesn't own. In openQA, the test directory
is a git repo and openQA itself likes to run git commands on it,
but this is often going to be as a different user than the owner
of the directory. In fact on the worker hosts, the user that owns
the directory (geekotest on the server box) doesn't even exist.

This just sets the config by copying a file in place rather than
running a git command (which is hard to get to be idempotent) and
uses `/etc/gitconfig` so we don't wind up with a file in the
_openqa-worker user's home directory, which is meant to be empty.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-05-27 10:24:34 -07:00
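
The approach in bf4f704096, putting a config file in place rather than running a git command, is easy to make idempotent. The sketch below assumes the setting involved is git's `safe.directory` option, which the commit message does not name. The directory path is a placeholder, and a temporary file stands in for `/etc/gitconfig`.

```python
import tempfile
from pathlib import Path


def render_gitconfig(safe_dirs):
    """Render a minimal git config marking directories as safe."""
    lines = ["[safe]"]
    lines += [f"\tdirectory = {directory}" for directory in safe_dirs]
    return "\n".join(lines) + "\n"


def ensure_file(path, content):
    """Write content only if the file differs; return True if changed."""
    path = Path(path)
    if path.exists() and path.read_text() == content:
        return False
    path.write_text(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "gitconfig"  # stand-in for /etc/gitconfig
    content = render_gitconfig(["/path/to/test/dir"])  # placeholder path
    print(ensure_file(target, content))  # first run writes the file
    print(ensure_file(target, content))  # second run finds nothing to do
```

Because the second call reports no change, a play built this way stays quiet on repeat runs, which is hard to achieve when shelling out to a git command.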
Adam Williamson
3d148f5e7f openqa/worker: handle git 'safety' check for test dir
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-05-27 09:05:06 -07:00
Adam Williamson
f869c0f643 Revert "openqa/worker: handle git 'safety' check for test dir"
This reverts commit 34b3d3a5cc. On
second thoughts it's kinda ugly and I need to think about other
options...
2022-05-26 15:23:13 -07:00
Adam Williamson
34b3d3a5cc openqa/worker: handle git 'safety' check for test dir
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-05-26 15:05:00 -07:00
Adam Williamson
b5be505576 openqa/server: don't hide ISO assets any more
We were hiding these because in the past the only ISO assets
were those from the compose under test, and we wanted to avoid
people downloading them from openQA when we'd rather they get
them from dl.fp.o or the mirror system. But these days we have
tests that generate ISOs (update netinst and live image build
tests) and we often want to download the generated images to
test them locally.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-05-25 09:12:10 -07:00
Adam Williamson
e6e0e2f42d openqa: set up for new resultsdb location and auth on lab
This sets up the openQA lab instance to report to the new stg
instance of resultsdb, and use authentication. The scheduler
config file is now mode 0600 because it has a password in it.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-05-11 17:06:35 -07:00
Adam Williamson
58dd80c799 openqa/server: reduce PPC update group asset size
We need to treat it and the x86_64 update group separately to
do this, but it really doesn't need 200G. We have images from
three weeks ago, and we don't need that kind of buffer, and space
is a bit tight.

Note: there is no aarch64 updates group as we do not currently
run updates tests on aarch64.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2022-03-22 16:17:17 -07:00
Adam Williamson
3dec01a15a openqa/server: set httpd_can_network_connect boolean again :(
Seems there's one more port that needs to be tagged before we
can finally unset this:
https://bugzilla.redhat.com/show_bug.cgi?id=1277312#c9

Keep the custom policy as well, though, so we just need to
update it when that port gets done.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-14 16:33:19 -08:00
Adam Williamson
2320eef5ee openqa/worker: create custom SELinux module directory first
Whoops. Also order these things a bit better.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-14 15:54:38 -08:00
Adam Williamson
edc4caa833 openqa/server: use custom SELinux policy instead of boolean
We've been using the httpd_can_network_connect boolean for years
to allow httpd to connect to the openQA server processes. This
is an unnecessarily large hammer when we only need it to be
able to connect to exactly the two openQA ports. This uses a
custom SELinux policy to allow connecting to those ports only,
and ensures the boolean is set back to off.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-14 15:48:34 -08:00
Adam Williamson
67eb9bb288 openqa/server: clean up and trim package requirements
Several of these requirements are old ones that were only needed
for createhdds, when we ran createhdds on the servers. All of
those can go. Also make the list line-by-line for easier git
blame tracking in future (and add comments for the remaining
entries so we know why they're there).

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-14 14:43:29 -08:00
Adam Williamson
38888162ea openQA: remove swtpm-teardown now the work is done
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-06 14:18:46 -08:00
Adam Williamson
7a5d7f59fb openQA: Drop already-done step from swtpm-teardown
This just cleans up the mess from the bad parameter earlier. The
run of this play broke halfway through, and we need to do the
remaining half without choking on this part.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-06 14:12:43 -08:00
Adam Williamson
ca2684c711 openQA: fix stupid semodule argument
gah.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-06 14:05:14 -08:00
Adam Williamson
224e28131d openQA: prepare for prod deployment of latest releases
This unifies prod and stg onto the ways of doing things for the
latest packages, and rejigs the swtpm stuff a bit to tear down
more (we shouldn't need the custom SELinux policy any more).

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-12-06 10:40:33 -08:00
Adam Williamson
55be7c05f6 openQA: update AMQP config settings for lab
These need to change with the newer version of openQA.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-30 10:30:20 -08:00
Adam Williamson
5889d3a9ae openQA: untag the swtpm-teardown task for stg now it's run
Keeping it around to run on prod when needed, then we'll take it
out.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-26 14:03:33 -08:00
Adam Williamson
7f3f19035f openQA: test new os-autoinst scratch build on lab
This also tears down our swtpm systemd service setup, as
os-autoinst should now handle swtpm device setup for us.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-26 12:34:41 -08:00
Adam Williamson
3e4c3534e5 openqa: switch FCOS scheduling to messages, reduce duplication
This sets us up for scheduling FCOS tests from messages, not
using a cron job. Also reduces some duplication of variables
between openqa-servers-common and the dispatcher role defaults.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-24 10:59:01 -08:00
Adam Williamson
fc3f87b646 Revert "openQA: deploy new qemu build with qxl snapshot fix"
This reverts commit 92e66bb444 and
follow-up commits. We don't need it now we're back on virtio
graphics.
2021-11-12 15:40:00 -08:00
Adam Williamson
d00fdb03eb openqa/worker: install latest qemu-common
This makes the last change actually work. Temporary change.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-11-08 12:21:21 -08:00
Adam Williamson
faa8f6c27b openqa/worker: install packages used by tests
A recent test has a couple of perl deps, we need to ensure these
are installed on the worker hosts.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-06-18 08:56:17 -07:00
Adam Williamson
ca112a1922 openQA: update some branch names to 'main' not 'master'
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-06-01 13:39:02 -07:00
Adam Williamson
61af6f34ca openQA: update server config (disable audit, tweak cleanup)
We never use the auditing stuff, so let's turn it off (and set
short limits for audit event duration so we can run the cleanup
and get rid of existing audit events). Let's also use the new
setting that only runs asset cleanup if free space is low.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2021-04-08 09:24:23 -07:00
Adam Williamson
aa2a002a96 Change how we get the HTML file accessible in fedora_nightlies
Just can't get Apache config Alias to work for some reason, so
let's go with the flow and stick the file in openQA's public
directory. This works!

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-21 18:37:03 -08:00
Adam Williamson
efb353bc02 Let's make that IncludeOptional so lab doesn't die
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-21 17:47:23 -08:00
Adam Williamson
4851dc8d65 Try and do fedora_nightlies Apache config without breaking openQA
Er, oops. This involves a hack, but at least it doesn't take the
openQA web UI offline.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-21 17:43:55 -08:00
Adam Williamson
813bbc4d2a openqa/server: allow group to write to factory dirs
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 17:16:28 -08:00
Adam Williamson
61251d0b11 More syntax...sigh
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 16:24:27 -08:00
Adam Williamson
95f062c07a openQA: allow all workers NFS write access, other tweaks
The main goal of these changes is to allow all workers in each
deployment NFS write access to the factory share. This is because
I want to try using os-autoinst's at-job-run-time decompression
of disk images instead of openQA's at-asset-download-time
decompression; it avoids some awkwardness with the asset file
name, and should also actually allow us to drop the decompression
code from openQA I think.

I also rejigged various other things at the same time as they
kinda logically go together. It's mostly cleanups and tweaks to
group variables. I tried to handle more things explicitly with
variables, as it's better for use of these plays outside of
Fedora infra.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-11-05 16:10:32 -08:00
Adam Williamson
be8dc36f7f openqa/worker: sigh restarted not started
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 14:36:12 -07:00
Adam Williamson
c2023d5560 openQA: try to make NFS mount changes more robust
On client end, restart mount unit (with daemon-reload) if mount
file changes. On server end, run exportfs -r if export config
file changes.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2020-10-30 14:06:07 -07:00
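
The follow-up actions described in c2023d5560 can be summarised as a small decision function. This is a sketch of the logic only, not the Ansible handlers themselves, and the mount unit name is a placeholder.

```python
def follow_up_commands(side, config_changed, mount_unit="example.mount"):
    """Return the commands to run after a config file was (re)written."""
    if not config_changed:
        return []  # nothing changed, nothing to restart
    if side == "client":
        # Reload unit files first so systemd sees the changed mount file.
        return [
            ["systemctl", "daemon-reload"],
            ["systemctl", "restart", mount_unit],
        ]
    if side == "server":
        # Re-export according to the updated export config.
        return [["exportfs", "-r"]]
    raise ValueError(f"unknown side: {side!r}")


print(follow_up_commands("client", True))
print(follow_up_commands("server", True))
print(follow_up_commands("server", False))
```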