fedora-infra_ansible

mirror of https://pagure.io/fedora-infra/ansible.git synced 2026-03-20 03:57:02 +08:00

Author	SHA1	Message	Date
Adam Williamson	ae2cd3530b	roles/openqa/server: drop OpenID auth support We've been using OAuth2 for prod and stg for some time now, so let's clean this up. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2026-01-26 15:40:50 -08:00
Adam Williamson	4e4a12f2c3	roles/openqa/server: show more builds on the front page We only have two job groups, so the front page is a bit sad and empty. Let's show 10 builds per group, not 3. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2026-01-26 15:37:24 -08:00
Adam Williamson	804efd40d1	Update pagure.io/fedora-qa to forge.fedoraproject.org/quality Quality org has completed moving repos to Forgejo (all but one), so let's update all of these. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2026-01-23 15:31:46 -08:00
Adam Williamson	021c63e9df	Update some Forgejo-migrated repo URLs Signed-off-by: Adam Williamson <awilliam@redhat.com>	2026-01-09 18:51:10 -08:00
Adam Williamson	bd8ca3c2ca	openqa/worker: drop the whole 'no-ffmpeg-on-aarch64' thing it never worked anyway (ffmpeg always showed up, somehow) and on the new workers it doesn't seem to be an issue. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-11-17 15:31:53 -08:00
Adam Williamson	e1d8d18c3e	openqa/server: update and correct aarch64 update asset quota We weren't actually applying the quota we had defined (only) for lab, so both lab and prod had the upstream default (which now seems to be 200GB, not 100GB). Let's fix it so we do apply the value, and set it to 250GB for both prod and stg, because we're now aiming to have full parity in the update test sets between aarch64 and x86_64, and we have the space on the rdu3 hosts. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-09-03 14:19:23 -07:00
Adam Williamson	7387394898	openqa/worker: fix nfs hostname d'oh. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-21 10:14:34 -07:00
Adam Williamson	666196bbed	openqa/worker: don't start worker unless NFS mount is up There's this annoying pattern where the NFS mount fails on boot and then the worker services all start up and take jobs, but they instafail because the share isn't there. Ideally we could handle this very easily with Restart= directives but systemd has...opinions about this: https://github.com/systemd/systemd/issues/4468 https://github.com/systemd/systemd/issues/1312 so we have to do some fairly awkward hacks to just express: * Retry the NFS mount if it fails * Don't start the workers unless the NFS mount is up * Retry the workers after a while if they were blocked It's ugly, but in testing this same config on one worker it seems to work... Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-10 19:07:54 -07:00
Adam Williamson	099406f1b9	openqa/worker tap: set CAP_NET_ADMIN on qemu I have no idea why we didn't need this before, but we seem to need it now. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-02 20:08:18 -07:00
Adam Williamson	9d931214ea	Revert "openQA: rename openvswitch bridge device to avoid conflict" This reverts commit `4dc01bc892` and a follow-up commit. I'm having trouble getting things to work and want to see if it works if we go back to having the openQA bridge be br0, and rename the bridge used for the system's bonded network connection to something else instead.	2025-07-02 17:25:18 -07:00
Adam Williamson	10b68ac01f	openqa/worker: remove old unused files Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-02 17:23:42 -07:00
Adam Williamson	b343d8de52	Try and fix openQA bridge config Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-02 16:20:39 -07:00
Adam Williamson	f7ca68a38e	openqa/dispatcher: install resultsdb_conventions I thought/assumed/knew/something? that resultsdb_conventions_fedora required resultsdb_conventions, but right now it seems it doesn't. It should, but I can't fix it right now as the buildsystem is down, so let's just install it here... Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-02 15:20:09 -07:00
Adam Williamson	6606be7010	openqa tap: tell os-autoinst-openvswitch to use the right bridge Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-02 08:37:35 -07:00
Adam Williamson	69c9099615	openqa/worker tap: 'nogroup' is no more, must use 'nobody' Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-07-01 17:40:25 -07:00
Adam Williamson	4dc01bc892	openQA: rename openvswitch bridge device to avoid conflict On the new rdu3 worker hosts, br0 already exists and is the main system 'interface' (it's a bridge on two bonded physical interfaces connected to different switches, to make networking upgrades easier). So we can't call our openvswitch bridge 'br0' any more. Let's try calling it 'openqabr0' and see if anything explodes. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-06-30 16:14:01 -07:00
Aurélien Bompard	8248acbe1e	RabbitMQ: Drop the zmq.topic exchange, fedmsg has been retired Signed-off-by: Aurélien Bompard <aurelien@bompard.org>	2025-06-26 10:58:07 +02:00
Adam Williamson	5561372a5f	openqa/worker: drop edk2-arm package install It no longer exists and we're no longer doing 32-bit ARM tests. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-05-07 10:39:33 -07:00
Adam Williamson	5da2faac67	openqa/server: allow OAuth2 authentication, enable on lab OpenID support in FAS is going away. openQA has OAuth2 support. I've tested this config to work with manual edits on lab, now ansiblizing it (for lab only to start with). Signed-off-by: Adam Williamson <awilliam@redhat.com>	2025-03-28 13:40:57 -07:00
Michal Konecny	2ec055db6f	Use first uppercase letter for all handlers This will unify all the handlers to use first uppercase letter for ansible-lint to stop complaining. I went through all `notify:` occurrences and fixed them by running ``` set TEXT "text_to_replace"; set REPLACEMENT "replacement_text"; git grep -rlz "$TEXT" . \| xargs -0 sed -i "s/$TEXT/$REPLACEMENT/g" ``` Then I went through all the changes and removed the ones that wasn't expected to be changed. Fixes https://pagure.io/fedora-infrastructure/issue/12391 Signed-off-by: Michal Konecny <mkonecny@redhat.com>	2025-02-10 20:31:49 +00:00
Kevin Fenzi	6c38d7b61a	various: fix some more shell variables that were accidentally converted to builtin.shell Signed-off-by: Kevin Fenzi <kevin@scrye.com>	2025-01-15 17:26:50 -08:00
Ryan Lerch	47c68f478d	ansiblelint fixes - fqcn[action-core] - template to ansible.builtin.template Replaces references to template: with ansible.builtin.template Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-15 11:30:29 +10:00
Ryan Lerch	3c41882bb0	ansiblelint fixes - fqcn[action-core] - shell to ansible.builtin.shell Replaces references to shell: with ansible.builtin.shell Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-15 11:29:10 +10:00
Ryan Lerch	25391e95b7	ansiblelint fixes - fqcn[action-core] - package to ansible.builtin.package Replaces many references to package: with ansible.builtin.package Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-15 11:28:00 +10:00
Ryan Lerch	462176464b	ansiblelint fixes-- fqcn[action-core] - command to ansible.builtin.command Replaces many references to command: with ansible.builtin.command Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-15 11:26:47 +10:00
Ryan Lerch	6a3816dfdc	ansiblelint fixes-- fqcn[action-core] - copy to ansible.builtin.copy Replaces many references to 'copy' with ansible.builtin.copy Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-15 10:43:31 +10:00
Ryan Lerch	62952df107	ansiblelint fixes-- fqcn[action-core] - file to ansible.builtin.file Replaces many references to file: with ansible.builtin.file Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-15 10:41:52 +10:00
Ryan Lerch	691adee6ee	Fix name[casing] ansible-lint issues fix 1900 failures of the following case issue: `name[casing]: All names should start with an uppercase letter.` Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2025-01-14 20:20:07 +10:00
Ryan Lerch	89f6f1fc32	Fix majority of remaining yamllint warnings and errors Signed-off-by: Ryan Lerch <rlerch@redhat.com>	2024-11-28 17:31:45 +10:00
Adam Williamson	4d801444a9	openqa: set up a side repo for prod as well as lab Sometimes we want to deploy something to prod before it goes stable (or even to u-t). Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-11-25 17:06:34 -08:00
Adam Williamson	1a537f38ce	openqa/server: correct scratchrepo removal d'oh. this has been broken for some time... Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-11-20 17:27:57 -08:00
Adam Williamson	cb026b4120	openqa/worker: kill stuck qemu processes daily This is an awful hack to deal with https://github.com/os-autoinst/os-autoinst/issues/2549 while we try and fix it properly. This finds stuck qemu processes by parsing the journal messages of the workers, and kills them. workers stuck in the broken state should then recover on the next checkin with the server. I tested this manually on all the worker hosts and it...seemed to work, mostly. I'll keep an eye on things after deploying it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-10-15 13:13:42 -07:00
Adam Williamson	1325a7ab15	adamverse: add --no-deps to pip install commands In various roles I maintain I use `python3 -m pip install` to directly install a Python project (usually a fedora-messaging consumer), to avoid the pointless bureaucracy of packaging them. The roles install all the deps of these projects as packages first, so pip doesn't have to install any deps, it only installs the project itself. Well...that's the idea. It's possible for this to go wrong (say I forget to update the roles when adding a dep to the project), and in that case I think we'd rather have things blow up (so I know something's wrong) than have pip silently install some random upstream wheel system-wide to make it work. The intent is that all the deps still come from proper Fedora packages, only these projects themselves get installed directly. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-09-20 11:10:09 -07:00
Adam Williamson	2dbf99e280	openqa/worker: bump load average threshold for big worker hosts This is a new feature in openQA that prevents worker hosts picking up new jobs if their load average is above a certain threshold. It defaults to 40. Our big worker hosts tend to run above this, so let's bump it on those. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-08-30 23:27:48 -07:00
Adam Williamson	4743c3fdce	openqa/worker: transition all tap workers to NM-based setup This seems to be working fine in testing, so let's deploy it everywhere. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-07-25 14:54:03 -07:00
Adam Williamson	762b23ef7d	openqa/worker tap-setup-nm: tweak some quoting, drop tunctl Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-07-25 14:06:52 -07:00
Adam Williamson	690a5eb951	openqa/worker: add NM-based tap setup and test on p09-worker01 network-scripts-openvswitch was removed in f40 and network-scripts is going away in f41; we really need to get off using them. This attempts to implement the same setup using NetworkManager, based on a few different NM/ovs references, and the source of openQA upstream's os-autoinst-setup-multi-machine . It might need a bit of tweaking, so for now, we make it a separate task and use it only on p09-worker01 for testing. This doesn't handle tearing down the old network-scripts-based config as that's pretty complex and will only need to happen once; I'll do it manually before trying this out. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-07-25 13:50:39 -07:00
Adam Williamson	7f73f1253e	openqa/worker: don't explicitly pull in ffmpeg-free on aarch64 We don't want it there - see earlier commits - but I didn't notice it's actually explicitly listed here for all arches, which breaks stuff on aarch64 now we told dnf to exclude it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-06-11 11:08:36 -07:00
Adam Williamson	44c5c79ad7	openqa/dispatcher: add a cron job to send missed test results This works around an annoying problem where, for some reason, we sometimes just miss sending completed test results to resultsdb. I've never been able to figure out why this happens, but this should band-aid it by looking, daily, for updates stuck in waiting gating status, checking for cases where a test finished but we didn't send a result, and sending it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-05-27 15:25:25 -07:00
Adam Williamson	826fd32330	openqa/worker: don't force createhdds off non-standard branch Using the same approach as we do for the tests and fedora_openqa. I wish I'd done this before I ran the playbook on lab and it wiped every...single...goddamn...disk image. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-03-26 15:35:33 -07:00
Adam Williamson	7cfe3d61e6	openqa/worker: block ffmpeg-free on aarch64 Encoding with ffmpeg rather than os-autoinst's built-in encoder gives us less broken videos, but on aarch64 it seems to cause problems, especially on stg's old, busted worker hosts - I think it's more CPU-intensive and they just can't handle the load. So, let's block it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-03-26 15:07:27 -07:00
Adam Williamson	da391c4ba2	openQA: trim default routing keys for scheduler consumer With Bodhi 8 we no longer need to listen to request.testing or update.edit messages. Yay. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-02-09 12:25:55 -08:00
Adam Williamson	d3b6d1bafd	openqa/worker: install ffmpeg-free I'm adding this as a Recommends: for os-autoinst, but want to get it on the workers now. Having it installed gives us better videos of test runs (the internal video encoder is a bit wonky and produces videos that have errors which make jumping around within the video not work properly). Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-01-03 10:38:11 -08:00
Adam Williamson	1cb0c0cdc6	Put openqa-lab-repo.repo in worker role as well as server role Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-10-27 18:09:24 -07:00
Adam Williamson	504b8217d3	openqa etc.: use pip for local installs, not setuptools On Fedora 39, we ran into an issue with setuptools that isn't immediately resolvable: https://github.com/pypa/setuptools/issues/3797#issuecomment-1783613895 using pip like this seems to avoid it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-10-27 17:23:53 -07:00
Adam Williamson	530f69d967	openqa: use an external side repo for test builds It's overall simpler and more idempotent to just use a side repo maintained outside of ansible than re-create one on each system on each run of the plays. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-10-27 11:20:58 -07:00
Adam Williamson	374956365e	openqa: drop the results_min_free_disk_space_percentage cleanup It is extremely slow to run, and we figured out that the problem on openqa01 was excessive space being used by Netapp snapshots, so we don't need this any more. It was actually deleting old jobs before their time, because it had already wiped every video file and didn't know what else to do... Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-07-25 15:13:07 -07:00
Adam Williamson	1e26a28c2c	openqa/server: try setting a limit on test result disk usage We're having issues with test results eating up all the disk space we can throw at them (prod is over 4T, stg is over 2T - I don't know why prod is bigger, that's odd, but it may be an odd effect of having more arches on stg, maybe aarch64 and ppc64le tests generally have smaller videos, or something). This config setting should make openQA keep the space usage on the partition at a max of 85%, by deleting videos from older tests as required. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-07-21 10:19:54 -07:00
Adam Williamson	a5c322b4ee	More cleanup on the openQA AMQP stuff nirik and I went around and around a bit today and ended up back where we started, but with a clearer understanding of where that this. This explains it a bit better, and makes what's actually going on in various places clearer with the use of appropriate shared variables. This should not actually change anything at all when deployed. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-06-22 23:21:28 +02:00
Adam Williamson	de979123fa	openQA: don't install the fedoraupdaterestart plugin any more We don't need it, we use upstream RETRY now. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-12-19 16:16:11 -08:00

1 2 3 4 5 ...

425 Commits