Aurélien Bompard
f185573c41
Do stuff on iad2_internal also on rdu3_internal
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2025-06-23 19:02:44 +02:00
Aurélien Bompard
d22bde741d
Nagios: template the mail_queue.cfg file
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2025-06-23 18:11:28 +02:00
Aurélien Bompard
9007df7619
Don't change the template name, or it will be the name of the remote file
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2025-06-23 10:27:03 +02:00
Kevin Fenzi
aeaa0811c4
nagios: fix task to match the real template name
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-21 11:45:49 -07:00
Kevin Fenzi
ad3533e506
nagios: try and split out all hostgroups into _iad2 and _rdu3
...
We want to monitor iad2 from noc01.iad2 and rdu3 from noc01.rdu3, so
try and split this out into separate all groups for each datacenter.
This will likely miss some things that aren't split out into separate
_iad2 and _rdu3 groups, but we can hopefully fix those.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-21 11:26:38 -07:00
Kevin Fenzi
7113cce4ec
nagios_server: fix missing = in when
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-21 11:06:35 -07:00
Kevin Fenzi
3be2d89e66
nagios: also add these templates in rdu3
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-20 22:34:45 -07:00
Kevin Fenzi
4ca8fa862c
nagios: adjust when clause for rdu3
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-20 22:11:57 -07:00
Kevin Fenzi
a42481a782
nagios/rdu3: need templates and other config in rdu3 also
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-20 21:49:17 -07:00
Kevin Fenzi
449385c8b0
nagios: move rdu3 hosts over to noc01.rdu3
...
Also open firewalls to allow noc03.rdu3 to access them.
Also enable nagios_server on noc01.rdu3.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-20 20:29:24 -07:00
Kevin Fenzi
2b3441492a
nagios: add rdu3-hosts template to be deployed
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-20 18:05:31 -07:00
Michal Konecny
f63e839698
[nagios-server] Move the datanommer checks to noc01
...
There were a few fedora-messaging datanommer checks that were running on
busgateway01. As this machine is part of fedmsg, it will be
decommissioned. Let's move the checks to noc01.
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-14 09:45:39 +00:00
Michal Konecny
6428f8f772
Sunset github2fedmsg and fedmsg
...
This commit removes all the fedmsg-related stuff from the ansible
repository.
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-13 10:08:51 +00:00
Michal Konecny
2ec055db6f
Use first uppercase letter for all handlers
...
This will unify all the handlers to start with an uppercase letter so
that ansible-lint stops complaining.
I went through all `notify:` occurrences and fixed them by running
```
set TEXT "text_to_replace"; set REPLACEMENT "replacement_text"; git grep -rlz "$TEXT" . | xargs -0 sed -i "s/$TEXT/$REPLACEMENT/g"
```
Then I went through all the changes and removed the ones that weren't
expected to be changed.
Fixes https://pagure.io/fedora-infrastructure/issue/12391
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-10 20:31:49 +00:00
Kevin Fenzi
22f3d8832f
handlers: more renaming fixes
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-01-24 14:06:11 -08:00
Ryan Lerch
47c68f478d
ansiblelint fixes - fqcn[action-core] - template to ansible.builtin.template
...
Replaces references to template: with ansible.builtin.template
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 11:30:29 +10:00
Ryan Lerch
25391e95b7
ansiblelint fixes - fqcn[action-core] - package to ansible.builtin.package
...
Replaces many references to package: with ansible.builtin.package
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 11:28:00 +10:00
Ryan Lerch
462176464b
ansiblelint fixes - fqcn[action-core] - command to ansible.builtin.command
...
Replaces many references to command: with ansible.builtin.command
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 11:26:47 +10:00
Ryan Lerch
6a3816dfdc
ansiblelint fixes - fqcn[action-core] - copy to ansible.builtin.copy
...
Replaces many references to 'copy' with ansible.builtin.copy
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 10:43:31 +10:00
Ryan Lerch
62952df107
ansiblelint fixes - fqcn[action-core] - file to ansible.builtin.file
...
Replaces many references to file: with ansible.builtin.file
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-15 10:41:52 +10:00
Ryan Lerch
691adee6ee
Fix name[casing] ansible-lint issues
...
fix 1900 failures of the following case issue:
`name[casing]: All names should start with an uppercase letter.`
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2025-01-14 20:20:07 +10:00
Ryan Lerch
89f6f1fc32
Fix majority of remaining yamllint warnings and errors
...
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2024-11-28 17:31:45 +10:00
Kevin Fenzi
ef8a734d69
nagios: also make sure the service is running and enabled
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-11-21 12:53:00 -08:00
Kevin Fenzi
160a909053
noc: install ipmitool as well
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-11-19 13:22:48 -08:00
Kevin Fenzi
c84b99223c
osbs: raise a glass for its service
...
This removes osbs and almost all its associated playbooks and files.
It served long and well, but we no longer need it.
flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did they could
be added to kiwi.
This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to set up ansible on a control
host and run it from our ansible).
Good bye osbs!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Leo Puvilland
18e4f51c61
Make only the nagios group able to execute the matrix-notify script
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-21 14:46:02 -08:00
Leo Puvilland
11b56e8551
Fix path to Matrix-Notify script
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 10:29:15 -08:00
Leo Puvilland
e04948b31a
Fix template file not being copied (matrix-notify script)
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-19 09:18:16 -08:00
Leo Puvilland
05bff0da9f
nagios matrix notify: use full filename for script in role
...
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:13:23 +10:00
Leo Puvilland
48d7982ebf
Correct syntax error in nagios role
...
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:08:07 +10:00
Leo Puvilland
5aafc6a1d2
Move nagios notifications to Matrix
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-18 15:55:30 -08:00
Kevin Fenzi
d727ee47ea
nagios: remove another old notifs remnant
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 13:12:36 -08:00
Kevin Fenzi
fdf34aab57
nagios_server / noc02: set seboolean to allow certgetter to work
...
noc02 needs to be able to proxy to certgetter for the acme challenge for
ssl certs. So, set this there to allow that.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-08 15:40:23 -08:00
Kevin Fenzi
22dde8163b
unbound: remove and retire unbound servers
...
These instances served long and well as fallback resolvers for
dnssec-trigger. This is no longer needed or used, so let's remove them.
See https://pagure.io/fedora-infrastructure/issue/11415
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-24 14:40:43 -07:00
Kevin Fenzi
71cdddf55b
nagios: move the ipv6 specific ping config to a ping-ipv6.cfg file
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-11-17 16:39:11 -08:00
Kevin Fenzi
28fc20056a
nagios: fix typo
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-11-17 16:30:06 -08:00
Kevin Fenzi
b9b35a09ed
nagios: move ping.cfg to a template so it works for both nagios servers
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-11-17 16:19:50 -08:00
Stephen Smoogen
e36f982263
This should allow ansible to build the templates for noc01/noc02 correctly.
2022-11-17 12:06:00 -05:00
Seddik Alaoui Ismaili
9af427e1bf
add ipv6 check for fedorapeople
2022-11-17 01:40:25 +00:00
Kevin Fenzi
b388a003b4
nagios: add checks for ssl certs on fcos and ocp4 endpoints, change to just checking proxy01
...
Add checks for ssl certs on fcos openshift endpoints.
Add checks for ocp4 wildcard certs.
Change check to only use proxy01/proxy01.stg instead of all proxies.
Ideally we really do want to check all proxies, but in practice this
results in like 70 alerts anytime the cert is going to expire.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-02 15:47:23 -08:00
Silvie Chlupova
6fa2999dbf
copr: use already existing copr.cfg
2022-01-20 13:23:31 +01:00
Silvie Chlupova
8c5dc50c7e
copr: move copr nagios services into separate file
2022-01-20 12:14:48 +01:00
Pavel Raiskup
6062eec80b
nagios: drop copr_external.cfg from services
2021-08-10 09:59:23 +02:00
Pavel Raiskup
d2f9b772e9
nagios: move copr-ping to internal
2021-08-10 08:51:55 +02:00
Pavel Raiskup
f76859775c
nagios: pick up copr_external.cfg services
2021-08-09 13:50:30 +02:00
Pavel Raiskup
29fb33bbb7
copr-be: test remaining results storage space
2021-07-28 13:51:16 +02:00
Rick Elrod
dcc53bd63b
add crl check to nagios + nrpe + facl perms for nrpe
...
Signed-off-by: Rick Elrod <relrod@redhat.com>
2020-08-06 15:32:09 -05:00
Kevin Fenzi
6371dd26c3
nagios / server: fix check_koji plugin name
...
As it was, it copied the check_koji.j2 template in ansible to
check_koji.j2 on the server, which meant that check_koji, the actual
script, wasn't on noc01 and the check couldn't work.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-07-09 16:49:07 -07:00
Stephen Smoogen
35f1746c3f
things become clearer when we find a missing internal on something that says for internal
2020-07-01 18:20:14 -04:00
Stephen Smoogen
6e218c7031
a box not on the vpn has a hard time testing for boxes on the vpn
2020-07-01 18:14:02 -04:00