Commit Graph

1077 Commits

Author SHA1 Message Date
Stephen Smoogen
a0397d7abb Add blocks to nagios.conf httpd
I forgot I am the expert on nagios configs so added it to the template
file.

Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2024-07-09 09:18:56 +00:00
Kevin Fenzi
2397e3fbc4 mirrormanager: remove no longer needed nagios check for frontend
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-07-01 14:37:55 -07:00
Kevin Fenzi
4bcbc54efa people: retire people02
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-06-27 15:38:03 -07:00
James Antill
d7258e320e Add DNF countme nagios checks.
Signed-off-by: James Antill <james@and.org>
2024-06-27 17:35:23 +00:00
Kevin Fenzi
84a7a7afc8 nagios: adjust nrpe for badges vs old fedbadges
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:54:53 -07:00
Kevin Fenzi
71d5c496d4 nagios: fix badges monitoring check in nagios
This changed from 'fedbadges' to 'badges'.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:07:21 -07:00
Kevin Fenzi
d366194a22 module-build-service (mbs): retire service
With the EOL of Fedora 38 yesterday, we are no longer building any
modules and can retire our module build service.

Note that toddlers still needs to be adjusted; that will happen after
this.

Thanks for all the modules!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-22 13:38:53 -07:00
Ryan Lerch
675f400fdf Add ryanlerch to nagios commands lists
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2024-05-16 10:59:50 +10:00
Kevin Fenzi
e472e0c1b6 noc / badges: remove another old vm monitoring remnant
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-06 13:40:44 -07:00
Kevin Fenzi
ce72533001 nagios / badges: remove old fedmsg checks
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-06 13:11:59 -07:00
Leo Puvilland
5e59e8c213 add current oncall and recent oncalls to nagios permissions CGI
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-04-25 00:17:29 +00:00
Kevin Fenzi
c84b99223c osbs: raise a glass for its service
This removes osbs and almost all its associated playbooks and files.

It served long and well, but we no longer need it.
flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did, they
could be added to kiwi.

This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to setup ansible on a control
host and run it from our ansible).

Good bye osbs!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Leo Puvilland
daa5e252cc nagios: fix stray ampersand that was breaking the curl command
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-02-23 18:51:32 -08:00
Leo Puvilland
fac5a39208 nagios: add parameter for which nagios host is sending the alert
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-02-22 00:44:19 +00:00
Kevin Fenzi
f95712d8a0 nagios / koji: drop ssl cert check
This check was from long ago when koji used a self-signed cert/CA.
It still amusingly has that configured, so this check is telling us that
the self-signed cert that we don't use anymore is expiring. :)
So, just drop this; koji is being proxied now and uses our main wildcard
cert.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-02-13 10:13:48 -08:00
Leo Puvilland
172a57c0cf nagios: remove serviceackauthor from host notifications
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-01-24 03:34:52 +00:00
Leo Puvilland
c2b5cf45ac Switch to SERVICESTATE instead of HOSTSTATE in notify.cfg
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2024-01-08 21:59:13 +00:00
Leo Puvilland
18e4f51c61 Make only the nagios group able to execute the matrix-notify script
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-21 14:46:02 -08:00
Leo Puvilland
00d82f8610 Add matrix-bot to ircbot contactgroup
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 15:35:19 -08:00
Leo Puvilland
11b56e8551 Fix path to Matrix-Notify script
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-20 10:29:15 -08:00
Leo Puvilland
e04948b31a Fix template file not being copied (matrix-notify script)
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-19 09:18:16 -08:00
Leo Puvilland
05bff0da9f nagios matrix notify: use full filename for script in role
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:13:23 +10:00
Leo Puvilland
48d7982ebf Correct syntax error in nagios role
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
2023-12-19 10:08:07 +10:00
Leo Puvilland
5aafc6a1d2 Move nagios notifications to Matrix
Signed-off-by: Leo Puvilland <leo@craftcat.dev>
2023-12-18 15:55:30 -08:00
Kevin Fenzi
2524e7c258 nagios: stop trying to monitor start.fedoraproject.org, as it's now under fedoraproject.org/start
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-30 14:52:03 -08:00
Kevin Fenzi
f42ce93d85 nagios: remove missed value01 reference
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-16 13:54:09 -08:00
Kevin Fenzi
d727ee47ea nagios: remove another old notifs remnant
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 13:12:36 -08:00
Kevin Fenzi
20dc948173 notifs (old fmn): retire
We are retiring this in favor of the new service.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 12:28:28 -08:00
Kevin Fenzi
3808d867de value01/value01.stg: retire
These are old rhel7 instances. The only thing left on them is fedmsg-irc
(sending to one irc channel, fedora-releng). Move everything to use the
newer rhel8 value02 instead.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 12:13:38 -08:00
Kevin Fenzi
a60ca7159f nuancier: retire and remove from ansible
See https://pagure.io/fedora-infrastructure/issue/11371
This service is retired.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 10:44:00 -08:00
Pavel Raiskup
8e6de8396e nagios: send notifications to copr-team@redhat.com
Instead of separate members.  This is just to align with:
https://accounts.fedoraproject.org/group/copr-sig/
2023-11-13 15:32:26 +01:00
Kevin Fenzi
fdf34aab57 nagios_server / noc02: set seboolean to allow certgetter to work
noc02 needs to be able to proxy to certgetter for the acme challenge for
ssl certs. So, set this there to allow that.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-08 15:40:23 -08:00
Kevin Fenzi
f0e6442a27 noc: drop bodhi nagios alert group
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-09-21 14:01:18 -07:00
Pavel Raiskup
fdb5bc033e nagios_server: add Jiří Kyjovský as a point of contact
2023-09-08 08:08:03 +02:00
Adam Williamson
8286b8f6c8 Port check_nagios_notifications.py to Python 3
Saw from one of the emails this morning that this isn't running
because there's no python2 on whatever system it was trying to
run on. This ports it to Python 3 (thanks, 2to3) and cleans up
the formatting (thanks, black). I tested it with a random sample
file I found lying around the internet -
https://github.com/bahamas10/node-nagios-status-parser/blob/master/status.dat
and it seems to do what it's supposed to do.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
2023-08-14 08:58:54 -07:00
Kevin Fenzi
22dde8163b unbound: remove and retire unbound servers
These instances served long and well as fallback resolvers for
dnssec-trigger. This is no longer needed or used, so let's remove them.
See https://pagure.io/fedora-infrastructure/issue/11415

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-24 14:40:43 -07:00
Pavel Raiskup
0944ac4ef3 copr-dist-git: decrease storage warning quota
With 5T storage, it is enough to warn on remaining 12%, and error on 6%.
2023-07-24 07:14:16 +02:00
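The percentages in the copr-dist-git commit above translate into absolute free space as follows. This is only an arithmetic sketch: the helper name `free_space_thresholds` is hypothetical, the 5T total and the 12%/6% levels come from the commit message, and the actual nagios check definition is not shown in this log.

```python
def free_space_thresholds(total, warn_pct, crit_pct):
    """Return (warn, crit) free-space amounts in the same unit as `total`.

    Hypothetical helper for illustration; it only converts percentages
    of remaining space into absolute amounts.
    """
    return total * warn_pct / 100, total * crit_pct / 100


# Figures taken from the commit message: 5T storage, warn at 12%, error at 6%.
warn_t, crit_t = free_space_thresholds(5, 12, 6)
print(f"warn below {warn_t:.1f}T free, error below {crit_t:.1f}T free")
```

With these inputs the warning level works out to about 0.6T of free space and the error level to about 0.3T.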
Stephen Smoogen
7d7d0bf0a8 Remove smooge from various aliases
Currently, I (Stephen Smoogen) do not have the time to work on Fedora
system administration items. However, I get a lot of email and people
see my email address in various places to ping me for working on
things. I feel it would be better to remove myself from those places
and let Fedora Infrastructure add someone else to replace me when it
is possible to do so.

Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
2023-07-17 23:34:18 +00:00
Pavel Raiskup
09c7c868c6 copr-be: nagios: decrease the quota warning even more
2023-05-26 09:16:37 +02:00
Pavel Raiskup
5adcfbbbd6 copr-be: decrease the nagios warning quota to 10%, attempt #2
10% is still ~2.4T of free space, ATM it looks like enough to not start
the panic mode.

Complements: 2ed4e90feb
Fixes: https://github.com/fedora-copr/copr/issues/2737
2023-05-24 07:53:01 +02:00
Andrew Heath
9121258f52 reenable ansible nagios busgateway01 checks
2023-05-23 12:13:31 -04:00
Andrew Heath
9d3c107ef0 Disabling ansible check till we can troubleshoot
2023-05-19 20:07:41 +00:00
Andrew Heath
3600553301 removing nommer and fixing RPM sign
2023-05-19 20:07:41 +00:00
Kevin Fenzi
624f7545f0 Fare thee well 32bit arm. You served long and well.
Now that f36 is eol we don't need 32bit arm builders, test machines or
exceptions anywhere.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-05-16 17:05:14 -07:00
Aurélien Bompard
e1d3dcc491 Darn JS SPA
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-05-09 13:31:12 +02:00
Aurélien Bompard
5920da4334 FMN: fix the Nagios check again
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-05-09 10:05:25 +02:00
Aurélien Bompard
80c7b61487 FMN: update the nagios check
FMN is now running in OpenShift

Fixes: #11296

Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-05-09 09:14:25 +02:00
Aurélien Bompard
360e184862 FMN: move the old to -old and redirect to the new
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2023-04-26 10:55:25 +02:00
Pavel Raiskup
cb87003edc nagios_external: align icmp6 check with 5adeb88890
2023-04-26 09:24:45 +02:00
Pavel Raiskup
56c3f11a48 nagios: fix empty groups members in all-external.cfg.j2
2023-04-26 09:18:06 +02:00