Kevin Fenzi
289bda5698
nagios_client: install client on noc-cc01
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-08-01 16:34:40 -07:00
Michal Konecny
c8b62faaa4
[nagios_client] Fix for mailman api check
...
Mailman is now returning HTTP/1.1 instead of HTTP/1.0.
Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2024-06-28 10:19:10 +02:00
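The mailman api check fix above can be illustrated with a minimal sketch. It assumes the check compares the response status line against an expected string, which the commit message does not confirm; `status_ok` and `STATUS_LINE` are hypothetical names, not taken from the nagios_client role.

```python
import re

# Hypothetical pattern: accept a 200 response whether the server
# answers with HTTP/1.0 or HTTP/1.1.
STATUS_LINE = re.compile(r"^HTTP/1\.[01] 200\b")


def status_ok(status_line: str) -> bool:
    """Return True if the status line reports a 200 on HTTP/1.0 or HTTP/1.1."""
    return STATUS_LINE.match(status_line) is not None
```

Matching on the status code rather than on one exact protocol version keeps a check like this from failing when only the HTTP version changes.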
James Antill
d7258e320e
Add DNF countme nagios checks.
...
Signed-off-by: James Antill <james@and.org>
2024-06-27 17:35:23 +00:00
Kevin Fenzi
84a7a7afc8
nagios: adjust nrpe for badges vs old fedbadges
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:54:53 -07:00
Kevin Fenzi
d366194a22
module-build-service (mbs): retire service
...
With the EOL of Fedora 38 yesterday, we are no longer building any
modules and can retire our module build service.
Note that toddlers still needs to be adjusted; that will happen after
this.
Thanks for all the modules!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-22 13:38:53 -07:00
Kevin Fenzi
c84b99223c
osbs: raise a glass for its service
...
This removes osbs and almost all its associated playbooks and files.
It served long and well, but we no longer need it.
Flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did, they could
be added to kiwi.
This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to setup ansible on a control
host and run it from our ansible).
Good bye osbs!
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Seddik Alaoui Ismaili
c05bcd289f
remove pynag from check ipa replica
2024-02-27 13:16:46 +00:00
Kevin Fenzi
a60ca7159f
nuancier: retire and remove from ansible
...
See https://pagure.io/fedora-infrastructure/issue/11371
This service is retired.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 10:44:00 -08:00
Andrew Heath
96d9ed3d6b
Adding more checks for the fedmsg socket
2023-08-16 14:08:16 -04:00
Kevin Fenzi
0066f3cc68
proxies / fedmsg_monitoring: revert part of last config change
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-08-15 14:37:27 -07:00
Kevin Fenzi
1c0516c831
nagios_client: adjust fedmsg monitoring
...
Copy the fixes from exceptions monitoring to backlog.
Fix the calls that were passing a trailing -, which isn't needed anymore.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-08-15 14:25:56 -07:00
Andrew Heath
c5daa84f53
Have script check for fedmsg socket
2023-08-15 21:18:18 +00:00
Kevin Fenzi
22dde8163b
unbound: remove and retire unbound servers
...
These instances served long and well as fallback resolvers for
dnssec-trigger. This is no longer needed or used, so let's remove them.
See https://pagure.io/fedora-infrastructure/issue/11415
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-24 14:40:43 -07:00
Pavel Raiskup
0944ac4ef3
copr-dist-git: decrease storage warning quota
...
With 5T of storage, it is enough to warn at 12% remaining and error at 6%.
2023-07-24 07:14:16 +02:00
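For the copr-dist-git quota change above, the percentages translate into absolute free space as follows. This is only a worked-arithmetic sketch; `free_space_thresholds` is a hypothetical helper and is not part of the nagios configuration.

```python
def free_space_thresholds(total_tb, warn_pct, crit_pct):
    """Convert percentage thresholds into terabytes of remaining free space."""
    return total_tb * warn_pct / 100, total_tb * crit_pct / 100


# Figures taken from the commit message: 5T volume, warn at 12%, error at 6%.
warn_tb, crit_tb = free_space_thresholds(5, 12, 6)
print(f"warn below {warn_tb} TB free, error below {crit_tb} TB free")
```

With these inputs the warning fires at roughly 0.6T free and the error at roughly 0.3T free.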
Kevin Fenzi
314fa870a9
notifs-backend: fix check script and increase limits
...
The check_rabbitmq_size script seems to have critical and warning
backwards and is doing str comparisons when int should be used.
Also increase the limits a bunch, as we don't want to be notified if it's
just backlogged a bit.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-10 15:00:04 -07:00
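The two bugs named in the notifs-backend commit above (swapped thresholds and string comparison) can be sketched as follows. This is not the actual check_rabbitmq_size script; `queue_state` and the threshold values in the usage note are hypothetical.

```python
def queue_state(size, warning, critical):
    """Return a Nagios-style (exit_code, label) for a queue size.

    Inputs are converted to int first: compared as strings, "9" > "10000"
    is True because the comparison is lexicographic.
    """
    size, warning, critical = int(size), int(warning), int(critical)
    # Test the critical threshold first, so that a value above both
    # limits is reported as CRITICAL rather than WARNING.
    if size >= critical:
        return 2, "CRITICAL"
    if size >= warning:
        return 1, "WARNING"
    return 0, "OK"
```

With assumed limits of 1000 (warning) and 10000 (critical), `queue_state("9", "1000", "10000")` reports OK, whereas a string comparison would have treated "9" as larger than both limits.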
Andrew Heath
a4ca219ed9
updating to point at monitoring-fedmsg-hub-3.socke
2023-05-30 15:49:45 -04:00
Andrew Heath
9a3bd882df
adding () per python3
2023-05-30 14:41:30 -04:00
Andrew Heath
de5ab8f045
updated the script to fix issues
2023-05-30 13:30:23 -04:00
Pavel Raiskup
09c7c868c6
copr-be: nagios: decrease the quota warning even more
2023-05-26 09:16:37 +02:00
Andrew Heath
9121258f52
reenable ansible nagios busgateway01 checks
2023-05-23 12:13:31 -04:00
Pavel Raiskup
2ed4e90feb
copr-be: decrease the nagios warning quota to 10%
...
10% is still ~2.4T of free space; at the moment that looks like enough
not to trigger panic mode.
https://github.com/fedora-copr/copr/issues/2737
2023-05-22 15:35:40 +02:00
Andrew Heath
9d3c107ef0
Disabling ansible check till we can troubleshoot
2023-05-19 20:07:41 +00:00
Andrew Heath
3600553301
removing nommer and fixing RPM sign
2023-05-19 20:07:41 +00:00
Andrew Heath
1bbd805e17
Remove remaining greenwave checks for busgateway
2023-04-11 17:32:49 +00:00
Pavel Raiskup
d0f7c7ca30
copr: use a deterministic nrpe UID again
...
It was notoriously colliding with other system users like copr-signer
and others.
Revert "copr: test without nrpe_client_uid specified"
This reverts commit 435b71a695.
2022-11-22 10:54:00 +01:00
Pavel Raiskup
435b71a695
copr: test without nrpe_client_uid specified
...
Revert "copr: define nrpe_client_uid=500"
This reverts commit fa5cd7344c.
2022-11-22 10:41:26 +01:00
Pavel Raiskup
fa5cd7344c
copr: define nrpe_client_uid=500
2022-11-22 10:37:15 +01:00
Pavel Raiskup
baa6a0dff0
nagios_client: typo s/null/omit/
2022-11-22 10:25:54 +01:00
Pavel Raiskup
2627babd44
nagios_client: precreate nrpe client
...
With a specific UID if {{ nrpe_client_uid }} is defined.
2022-11-22 10:16:14 +01:00
Kevin Fenzi
18eecec303
nagios: adjust redhat.com email check
...
Right now there's often a backlog due to it not responding to the fedora
side. So, let's bump up the checks a bit so they do not alert all the
time.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-10-12 12:45:48 -07:00
Aurélien Bompard
8962731dbc
Don't use datetime.fromtimestamp yet
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 18:37:27 +02:00
Aurélien Bompard
e979a1955e
Update the datanommer Nagios check to query datagrepper directly
...
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 16:17:14 +00:00
Pavel Raiskup
120acfb3e7
copr-be: really setup the copr-be storage warning to 12%
...
The templates got de-synced.
2022-04-23 23:54:23 +02:00
Pavel Raiskup
3186e413d6
nagios/copr: monitor inodes (and one additional volume)
2022-02-08 22:53:43 +01:00
Mikolaj Izdebski
26c38caafa
nagios: Remove check for supybot fedmsg plugin
...
Zodbot no longer has fedmsg plugin installed - supybot-fedmsg package
is not installed on value02 (RHEL 8) and supybot-fedmsg upstream
project on GitHub has been archived.
2021-11-03 22:49:21 +00:00
Jakub Kadlcik
9a8acc79ae
nagios: enable disk monitoring for copr instances
...
I think that / monitoring should work by default just by
setting `nrpe: true` because of

    define service {
      hostgroup_name        all, !mincheckgrp
      service_description   Disk_Space_/
      check_command         check_by_nrpe!check_disk_/
      use                   disktemplate
    }
2021-08-09 11:45:53 +00:00
Pavel Raiskup
29fb33bbb7
copr-be: test remaining results storage space
2021-07-28 13:51:16 +02:00
Pavel Raiskup
92ff0683f5
nrpe: check_disk order (almost) alphabetically
...
Without this, it was hard to tell if check_disk.cfg.j2 mirrors
nrpe.cfg.j2.
2021-07-28 13:41:26 +02:00
Michael Scherer
3b8504f293
Fix mention of Freenode
2021-07-02 11:17:20 +02:00
seddikalaouiismaili
ac9750d6a0
correct output message for nagios check
2021-06-07 23:48:23 +00:00
Francois Andrieu
d9fc78b0e4
nagios: remove MBSProducer check from mbs-backend
2021-05-21 18:58:14 +00:00
Francois Andrieu
9006cf784e
nagios: remove unused check_datanommer_faf
2021-05-21 18:57:09 +00:00
Kevin Fenzi
d890a9fbf4
bugzilla2fedmsg: drop checks against vm as it has moved to openshift
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-05-19 12:00:49 -07:00
Kevin Fenzi
740109a295
nagios_client / check_systemd_units: remove old debugging output
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 14:25:17 -07:00
Kevin Fenzi
cebb78ed82
nagios_client: the check_systemd_units is in scripts, not script
...
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 13:58:20 -07:00
seddikalaouiismaili
eae91f0d2b
install nrpe check for systemd units
2021-03-25 20:16:48 +00:00
Pierre-Yves Chibon
a32dabc92e
nagios_client: install the pagure systemd checks on all pagure instances
...
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2021-02-12 12:37:26 +01:00
seddikalaouiismaili
890dd31cb0
script to monitor systemd units on pagure
2021-02-12 11:34:57 +00:00
Pierre-Yves Chibon
65c85dd5ec
nagios: Fix the check_supybot_pugin
...
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 17:03:57 +02:00
Pierre-Yves Chibon
342c056ae4
nagios_client: Fix the check_ipa_replication plugin
...
It looks like the data it retrieves is in bytes and thus needs to be
decoded into a unicode string so we can use it as a regular string
later in our code.
Fixes https://pagure.io/fedora-infrastructure/issue/9372
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 10:46:45 +02:00
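The bytes-versus-string issue described in the check_ipa_replication commit above can be sketched like this. The sample payload and the `to_text` helper are hypothetical and are not taken from the plugin; the sketch also assumes the data is UTF-8 encoded.

```python
def to_text(data, encoding="utf-8"):
    """Decode bytes into str; pass str through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Hypothetical payload, standing in for whatever the plugin retrieves.
raw = b"replica status: ok"

# In Python 3, `"ok" in raw` raises TypeError because str and bytes do not mix.
# After decoding, ordinary string operations work as expected.
text = to_text(raw)
print("ok" in text)
```

Decoding once, right where the data is retrieved, keeps the rest of the check working with plain strings.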