fedora-infra_ansible

mirror of https://pagure.io/fedora-infra/ansible.git synced 2026-05-03 18:43:50 +08:00

Author	SHA1	Message	Date
Adam Williamson	4743c3fdce	openqa/worker: transition all tap workers to NM-based setup This seems to be working fine in testing, so let's deploy it everywhere. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-07-25 14:54:03 -07:00
Adam Williamson	762b23ef7d	openqa/worker tap-setup-nm: tweak some quoting, drop tunctl Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-07-25 14:06:52 -07:00
Adam Williamson	690a5eb951	openqa/worker: add NM-based tap setup and test on p09-worker01 network-scripts-openvswitch was removed in f40 and network-scripts is going away in f41; we really need to get off using them. This attempts to implement the same setup using NetworkManager, based on a few different NM/ovs references, and the source of openQA upstream's os-autoinst-setup-multi-machine . It might need a bit of tweaking, so for now, we make it a separate task and use it only on p09-worker01 for testing. This doesn't handle tearing down the old network-scripts-based config as that's pretty complex and will only need to happen once; I'll do it manually before trying this out. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-07-25 13:50:39 -07:00
Adam Williamson	7f73f1253e	openqa/worker: don't explicitly pull in ffmpeg-free on aarch64 We don't want it there - see earlier commits - but I didn't notice it's actually explicitly listed here for all arches, which breaks stuff on aarch64 now we told dnf to exclude it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-06-11 11:08:36 -07:00
Adam Williamson	44c5c79ad7	openqa/dispatcher: add a cron job to send missed test results This works around an annoying problem where, for some reason, we sometimes just miss sending completed test results to resultsdb. I've never been able to figure out why this happens, but this should band-aid it by looking, daily, for updates stuck in waiting gating status, checking for cases where a test finished but we didn't send a result, and sending it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-05-27 15:25:25 -07:00
Adam Williamson	826fd32330	openqa/worker: don't force createhdds off non-standard branch Using the same approach as we do for the tests and fedora_openqa. I wish I'd done this before I ran the playbook on lab and it wiped every...single...goddamn...disk image. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-03-26 15:35:33 -07:00
Adam Williamson	7cfe3d61e6	openqa/worker: block ffmpeg-free on aarch64 Encoding with ffmpeg rather than os-autoinst's built-in encoder gives us less broken videos, but on aarch64 it seems to cause problems, especially on stg's old, busted worker hosts - I think it's more CPU-intensive and they just can't handle the load. So, let's block it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-03-26 15:07:27 -07:00
Adam Williamson	da391c4ba2	openQA: trim default routing keys for scheduler consumer With Bodhi 8 we no longer need to listen to request.testing or update.edit messages. Yay. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-02-09 12:25:55 -08:00
Adam Williamson	d3b6d1bafd	openqa/worker: install ffmpeg-free I'm adding this as a Recommends: for os-autoinst, but want to get it on the workers now. Having it installed gives us better videos of test runs (the internal video encoder is a bit wonky and produces videos that have errors which make jumping around within the video not work properly). Signed-off-by: Adam Williamson <awilliam@redhat.com>	2024-01-03 10:38:11 -08:00
Adam Williamson	1cb0c0cdc6	Put openqa-lab-repo.repo in worker role as well as server role Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-10-27 18:09:24 -07:00
Adam Williamson	504b8217d3	openqa etc.: use pip for local installs, not setuptools On Fedora 39, we ran into an issue with setuptools that isn't immediately resolvable: https://github.com/pypa/setuptools/issues/3797#issuecomment-1783613895 using pip like this seems to avoid it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-10-27 17:23:53 -07:00
Adam Williamson	530f69d967	openqa: use an external side repo for test builds It's overall simpler and more idempotent to just use a side repo maintained outside of ansible than re-create one on each system on each run of the plays. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-10-27 11:20:58 -07:00
Adam Williamson	374956365e	openqa: drop the results_min_free_disk_space_percentage cleanup It is extremely slow to run, and we figured out that the problem on openqa01 was excessive space being used by Netapp snapshots, so we don't need this any more. It was actually deleting old jobs before their time, because it had already wiped every video file and didn't know what else to do... Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-07-25 15:13:07 -07:00
Adam Williamson	1e26a28c2c	openqa/server: try setting a limit on test result disk usage We're having issues with test results eating up all the disk space we can throw at them (prod is over 4T, stg is over 2T - I don't know why prod is bigger, that's odd, but it may be an odd effect of having more arches on stg, maybe aarch64 and ppc64le tests generally have smaller videos, or something). This config setting should make openQA keep the space usage on the partition at a max of 85%, by deleting videos from older tests as required. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-07-21 10:19:54 -07:00
Adam Williamson	a5c322b4ee	More cleanup on the openQA AMQP stuff nirik and I went around and around a bit today and ended up back where we started, but with a clearer understanding of where that is. This explains it a bit better, and makes what's actually going on in various places clearer with the use of appropriate shared variables. This should not actually change anything at all when deployed. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2023-06-22 23:21:28 +02:00
Adam Williamson	de979123fa	openQA: don't install the fedoraupdaterestart plugin any more We don't need it, we use upstream RETRY now. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-12-19 16:16:11 -08:00
Adam Williamson	f1e0e0d037	Fix openqa_tap truthiness checks Sigh, |bool doesn't do what you might think it does: https://medium.com/opsops/wft-bool-filter-in-ansible-e7e2fd7a148f Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-11-25 14:58:36 -08:00
Adam Williamson	28110d34be	openqa/worker: prepare to handle multiple tap worker classes I'm going to try splitting the tap jobs across multiple worker hosts. We have quite a lot of tap jobs, now, and I have seen sometimes a situation where all non-tap jobs have been run and the non-tap worker hosts are sitting idle, but the single tap worker host has a long queue of tap jobs to get through. We can't just put multiple hosts per instance into the tap class, because then we might get a case where job A from a tap group is run on one host and job B from a tap group is run on a different host, and they can't communicate. It's actually possible to set this up so it works, but it needs yet more complex networking stuff I don't want to mess with. So instead I'm just gonna split the tap job groups across two classes, 'tap' and 'tap2'. That way we can have one 'tap' worker host and one 'tap2' worker host per instance and arch, and they will each get about half the tap jobs. Unfortunately since we only have one aarch64 worker for lab it will still have to run all the jobs, but for all other cases we do have at least two workers, so we can split the load. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-11-25 14:11:58 -08:00
Adam Williamson	1c95ec9a35	Revert "openQA: set higher LimitRequestLine in httpd vhost config" This reverts commit `892453da7e`. openQA still had problems with the very long request, so I just did an ugly hack to get the request under the limit instead.	2022-10-21 17:12:15 -07:00
Adam Williamson	892453da7e	openQA: set higher LimitRequestLine in httpd vhost config The openQA job scheduler was hitting 414 errors today because an update has so many builds there are more than 8190 characters (the default limit) in the POST request. Let's bump the limit to 16000. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-10-21 08:38:05 -07:00
Adam Williamson	f122367c34	openqa/worker: change name on kernel override config file It really needs to be called exactly 60-block-scheduler.rules as it's overriding a file of the same name in `/usr`. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-08-17 14:28:18 -04:00
Adam Williamson	ca8e3db401	Add file missed from previous commit Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-08-17 13:46:16 -04:00
Adam Williamson	5b49611201	openqa/worker: override kernel scheduler config This applies only within Fedora infra for now, as we're not sure whether worker hosts on different hardware hit this bug. It's intended to work around: https://bugzilla.redhat.com/show_bug.cgi?id=2009585 a bug which results in the infra worker hosts hanging after a short time when running kernels newer than 5.11. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-08-17 13:34:15 -04:00
Adam Williamson	8e891fe4d5	openqa/server: update for git default branch rename Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-07-14 11:58:09 -07:00
Adam Williamson	7ba67fdc12	openQA: don't enable FedoraUpdateRestart plugin Upstream implemented a feature that we can use to do the same thing using just a test variable, so we're switching to that. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-07-06 10:42:26 -07:00
Adam Williamson	a91dfc29e9	openqa: twiddle with the delegation stuff again Ugh, we delegate for the assetsize stuff too and there's tons of that, splitting it would be awful. Let's try a different approach with a new optional variable for the delegate target. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 16:32:04 -07:00
Adam Williamson	42e930e97f	openqa-onebox: tweak db host stuff Using the machine's own hostname works for the ansible delegate stuff but doesn't work for openQA itself (if you try and access the DB by hostname like this, postgres denies access; you have to use 'localhost' for postgres to allow it). Using 'localhost' works for postgres but doesn't do the right thing for delegation. Let's use 'localhost' and split the two play steps into delegated and non-delegated versions. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 16:17:29 -07:00
Adam Williamson	d227dac859	openqa/dispatcher: don't require resultsdb and wiki URL/hostname The config file should treat these as optional, not every openQA instance wants to report results. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 15:21:15 -07:00
Adam Williamson	4fd83483fa	openqa/dispatcher: comment on the needed packages Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 15:21:15 -07:00
Adam Williamson	ccf3b23cd4	openqa/server: skip openqa.ini amqp section if vars not set We don't want to include this section if the vars aren't set. Not every openQA server has to be an AMQP publisher. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 15:21:15 -07:00
Adam Williamson	6c2991306c	openqa/server: only install nfs-utils when needed If there are no NFS workers, we don't need the NFS server. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 15:21:15 -07:00
Adam Williamson	0cf8a59fd5	openqa: fix openqa_nfs_{worker,client}s confusion again Missed from previous commit. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 13:26:22 -07:00
Adam Williamson	e5c5cc336f	openqa: fix confusion between openqa_nfs_{worker,client}s Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-06-07 13:26:01 -07:00
Adam Williamson	bf4f704096	openqa: improve how we do the git config thing The background to this is https://bugzilla.redhat.com/show_bug.cgi?id=2073414 , in response to which git was changed to die if a user runs git commands on a repo which it doesn't own. In openQA, the test directory is a git repo and openQA itself likes to run git commands on it, but this is often going to be as a different user than the owner of the directory. In fact on the worker hosts, the user that owns the directory (geekotest on the server box) doesn't even exist. This just sets the config by copying a file in place rather than running a git command (which is hard to get to be idempotent) and uses `/etc/gitconfig` so we don't wind up with a file in the _openqa-worker user's home directory, which is meant to be empty. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-05-27 10:24:34 -07:00
Adam Williamson	3d148f5e7f	openqa/worker: handle git 'safety' check for test dir Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-05-27 09:05:06 -07:00
Adam Williamson	f869c0f643	Revert "openqa/worker: handle git 'safety' check for test dir" This reverts commit `34b3d3a5cc`. On second thoughts it's kinda ugly and I need to think about other options...	2022-05-26 15:23:13 -07:00
Adam Williamson	34b3d3a5cc	openqa/worker: handle git 'safety' check for test dir Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-05-26 15:05:00 -07:00
Adam Williamson	b5be505576	openqa/server: don't hide ISO assets any more We were hiding these because in the past the only ISO assets were those from the compose under test, and we wanted to avoid people downloading them from openQA when we'd rather they get them from dl.fp.o or the mirror system. But these days we have tests that generate ISOs (update netinst and live image build tests) and we often want to download the generated images to test them locally. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-05-25 09:12:10 -07:00
Adam Williamson	e6e0e2f42d	openqa: set up for new resultsdb location and auth on lab This sets up the openQA lab instance to report to the new stg instance of resultsdb, and use authentication. The scheduler config file is now mode 0600 because it has a password in it. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-05-11 17:06:35 -07:00
Adam Williamson	58dd80c799	openqa/server: reduce PPC update group asset size We need to treat it and the x86_64 update group separately to do this, but it really doesn't need 200G. We have images from three weeks ago, and we don't need that kind of buffer, and space is a bit tight. Note: there is no aarch64 updates group as we do not currently run updates tests on aarch64. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2022-03-22 16:17:17 -07:00
Adam Williamson	3dec01a15a	openqa/server: set httpd_can_network_connect boolean again :( Seems there's one more port that needs to be tagged before we can finally unset this: https://bugzilla.redhat.com/show_bug.cgi?id=1277312#c9 Keep the custom policy as well, though, so we just need to update it when that port gets done. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-14 16:33:19 -08:00
Adam Williamson	2320eef5ee	openqa/worker: create custom SELinux module directory first Whoops. Also order these things a bit better. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-14 15:54:38 -08:00
Adam Williamson	edc4caa833	openqa/server: use custom SELinux policy instead of boolean We've been using the httpd_can_network_connect boolean for years to allow httpd to connect to the openQA server processes. This is an unnecessarily large hammer when we only need it to be able to connect to exactly the two openQA ports. This uses a custom SELinux policy to allow connecting to those ports only, and ensures the boolean is set back to off. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-14 15:48:34 -08:00
Adam Williamson	67eb9bb288	openqa/server: clean up and trim package requirements Several of these requirements are old ones that were only needed for createhdds, when we ran createhdds on the servers. All of those can go. Also make the list line-by-line for easier git blame tracking in future (and add comments for the remaining entries so we know why they're there). Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-14 14:43:29 -08:00
Adam Williamson	38888162ea	openQA: remove swtpm-teardown now the work is done Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-06 14:18:46 -08:00
Adam Williamson	7a5d7f59fb	openQA: Drop already-done step from swtpm-teardown This is just cleaning up the mess of the bad parameter from earlier, run of this play broke halfway through, need to do the remaining half without choking on this part. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-06 14:12:43 -08:00
Adam Williamson	ca2684c711	openQA: fix stupid semodule argument gah. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-06 14:05:14 -08:00
Adam Williamson	224e28131d	openQA: prepare for prod deployment of latest releases This unifies prod and stg onto the ways of doing things for the latest packages, and rejigs the swtpm stuff a bit to tear down more (we shouldn't need the custom SELinux policy any more). Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-12-06 10:40:33 -08:00
Adam Williamson	55be7c05f6	openQA: update AMQP config settings for lab These need to change with the newer version of openQA. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-11-30 10:30:20 -08:00
Adam Williamson	5889d3a9ae	openQA: untag the swtpm-teardown task for stg now it's run Keeping it around to run on prod when needed, then we'll take it out. Signed-off-by: Adam Williamson <awilliam@redhat.com>	2021-11-26 14:03:33 -08:00

391 Commits