Stephen Smoogen
a0397d7abb
Add blocks to nagios.conf httpd
...
I forgot I am the expert on nagios configs so added it to the template
file.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2024-07-09 09:18:56 +00:00
Kevin Fenzi
2397e3fbc4
mirrormanager: remove no longer needed nagios check for frontend
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-07-01 14:37:55 -07:00
Kevin Fenzi
4bcbc54efa
people: retire people02
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-06-27 15:38:03 -07:00
James Antill
d7258e320e
Add DNF countme nagios checks.
...
Signed-off-by: James Antill <james@and.org>
2024-06-27 17:35:23 +00:00
Kevin Fenzi
84a7a7afc8
nagios: adjust nrpe for badges vs old fedbadges
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:54:53 -07:00
Kevin Fenzi
71d5c496d4
nagios: fix badges monitoring check in nagios
...
This changed from 'fedbadges' to 'badges'.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:07:21 -07:00
Kevin Fenzi
d366194a22
module-build-service (mbs): retire service
...
With the EOL of Fedora 38 yesterday, we are no longer building any
modules and can retire our module build service.
Note that toddlers still needs to be adjusted; that will happen after
this.
Thanks for all the modules!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-22 13:38:53 -07:00
Ryan Lerch
675f400fdf
Add ryanlerch to nagios commands lists
...
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2024-05-16 10:59:50 +10:00
Kevin Fenzi
e472e0c1b6
noc / badges: remove another old vm monitoring remnant
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-06 13:40:44 -07:00
Kevin Fenzi
ce72533001
nagios / badges: remove old fedmsg checks
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-06 13:11:59 -07:00
Leo Puvilland
5e59e8c213
add current oncall and recent oncalls to nagios permissions CGI
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-04-25 00:17:29 +00:00
Kevin Fenzi
c84b99223c
osbs: raise a glass for its service
...
This removes osbs and almost all its associated playbooks and files.
It served long and well, but we no longer need it.
flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did they could
be added to kiwi.
This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to set up ansible on a control
host and run it from our ansible).
Good bye osbs!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Leo Puvilland
daa5e252cc
nagios: fix stray ampersand that was breaking the curl command
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-02-23 18:51:32 -08:00
Leo Puvilland
fac5a39208
nagios: add parameter for which nagios host is sending the alert
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-02-22 00:44:19 +00:00
Kevin Fenzi
f95712d8a0
nagios / koji: drop ssl cert check
...
This check was from long ago when koji used a self-signed cert/ca.
It still amusingly has that configured, so this check is telling us that
the self-signed cert that we don't use anymore is expiring. :)
So, just drop this; koji is being proxied now and uses our main wildcard
cert.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-02-13 10:13:48 -08:00
Leo Puvilland
172a57c0cf
nagios: remove serviceackauthor from host notifications
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-01-24 03:34:52 +00:00
Leo Puvilland
c2b5cf45ac
Switch to SERVICESTATE instead of HOSTSTATE in notify.cfg
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-01-08 21:59:13 +00:00
Leo Puvilland
18e4f51c61
Make only the nagios group able to execute the matrix-notify script
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-21 14:46:02 -08:00
Leo Puvilland
00d82f8610
Add matrix-bot to ircbot contactgroup
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 15:35:19 -08:00
Leo Puvilland
11b56e8551
Fix path to Matrix-Notify script
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 10:29:15 -08:00
Leo Puvilland
e04948b31a
Fix template file not being copied (matrix-notify script)
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-19 09:18:16 -08:00
Leo Puvilland
05bff0da9f
nagios matrix notify: use full filename for script in role
...
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:13:23 +10:00
Leo Puvilland
48d7982ebf
Correct syntax error in nagios role
...
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:08:07 +10:00
Leo Puvilland
5aafc6a1d2
Move nagios notifications to Matrix
...
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-18 15:55:30 -08:00
Kevin Fenzi
2524e7c258
nagios: stop trying to monitor start.fedoraproject.org, as it's now under fedoraproject.org/start
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-30 14:52:03 -08:00
Kevin Fenzi
f42ce93d85
nagios: remove missed value01 reference
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-16 13:54:09 -08:00
Kevin Fenzi
d727ee47ea
nagios: remove another old notifs remnant
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 13:12:36 -08:00
Kevin Fenzi
20dc948173
notifs (old fmn): retire
...
We are retiring this in favor of the new service.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 12:28:28 -08:00
Kevin Fenzi
3808d867de
value01/value01.stg: retire
...
These are old rhel7 instances. The only thing left on them is fedmsg-irc
(sending to one irc channel, fedora-releng). Move everything to use the
newer rhel8 value02 instead.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 12:13:38 -08:00
Kevin Fenzi
a60ca7159f
nuancier: retire and remove from ansible
...
See https://pagure.io/fedora-infrastructure/issue/11371
This service is retired.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 10:44:00 -08:00
Pavel Raiskup
8e6de8396e
nagios: send notifications to copr-team@redhat.com
...
Instead of separate members. This is just to align with:
https://accounts.fedoraproject.org/group/copr-sig/
2023-11-13 15:32:26 +01:00
Kevin Fenzi
fdf34aab57
nagios_server / noc02: set seboolean to allow certgetter to work
...
noc02 needs to be able to proxy to certgetter for the acme challenge for
ssl certs. So, set this there to allow that.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-08 15:40:23 -08:00
Kevin Fenzi
f0e6442a27
noc: drop bodhi nagios alert group
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-09-21 14:01:18 -07:00
Pavel Raiskup
fdb5bc033e
nagios_server: add Jiří Kyjovský as a point of contact
2023-09-08 08:08:03 +02:00
Adam Williamson
8286b8f6c8
Port check_nagios_notifications.py to Python 3
...
Saw from one of the emails this morning that this isn't running
because there's no python2 on whatever system it was trying to
run on. This ports it to Python 3 (thanks, 2to3) and cleans up
the formatting (thanks, black). I tested it with a random sample
file I found lying around the internet -
https://github.com/bahamas10/node-nagios-status-parser/blob/master/status.dat
and it seems to do what it's supposed to do.
Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-08-14 08:58:54 -07:00
Kevin Fenzi
22dde8163b
unbound: remove and retire unbound servers
...
These instances served long and well as fallback resolvers for
dnssec-trigger. This is no longer needed or used, so let's remove them.
See https://pagure.io/fedora-infrastructure/issue/11415
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-24 14:40:43 -07:00
Pavel Raiskup
0944ac4ef3
copr-dist-git: decrease storage warning quota
...
With 5T storage, it is enough to warn on remaining 12%, and error on 6%.
2023-07-24 07:14:16 +02:00
Stephen Smoogen
7d7d0bf0a8
Remove smooge from various aliases
...
Currently, I (Stephen Smoogen) do not have the time to work on Fedora
system administration items. However, I get a lot of email and people
see my email address in various places to ping me for working on
things. I feel it would be better to remove myself from those places
and let Fedora Infrastructure add someone else to replace me when it
is possible to do so.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2023-07-17 23:34:18 +00:00
Pavel Raiskup
09c7c868c6
copr-be: nagios: decrease the quota warning even more
2023-05-26 09:16:37 +02:00
Pavel Raiskup
5adcfbbbd6
copr-be: decrease the nagios warning quota to 10%, attempt #2
...
10% is still ~2.4T of free space, ATM it looks like enough to not start
the panic mode.
Complements: 2ed4e90feb
Fixes: https://github.com/fedora-copr/copr/issues/2737
2023-05-24 07:53:01 +02:00
Andrew Heath
9121258f52
reenable ansible nagios busgateway01 checks
2023-05-23 12:13:31 -04:00
Andrew Heath
9d3c107ef0
Disabling ansible check till we can troubleshoot
2023-05-19 20:07:41 +00:00
Andrew Heath
3600553301
removing nommer and fixing RPM sign
2023-05-19 20:07:41 +00:00
Kevin Fenzi
624f7545f0
Fare thee well 32bit arm. You served long and well.
...
Now that f36 is eol we don't need 32bit arm builders, test machines or
exceptions anywhere.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-05-16 17:05:14 -07:00
Aurélien Bompard
e1d3dcc491
Darn JS SPA
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-05-09 13:31:12 +02:00
Aurélien Bompard
5920da4334
FMN: fix the Nagios check again
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-05-09 10:05:25 +02:00
Aurélien Bompard
80c7b61487
FMN: update the nagios check
...
FMN is now running in OpenShift
Fixes: #11296
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-05-09 09:14:25 +02:00
Aurélien Bompard
360e184862
FMN: move the old to -old and redirect to the new
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-04-26 10:55:25 +02:00
Pavel Raiskup
cb87003edc
nagios_external: align icmp6 check with 5adeb88890
2023-04-26 09:24:45 +02:00
Pavel Raiskup
56c3f11a48
nagios: fix empty groups members in all-external.cfg.j2
2023-04-26 09:18:06 +02:00