302 Commits

Author SHA1 Message Date
Kevin Fenzi
289bda5698 nagios_client: install client on noc-cc01
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-08-01 16:34:40 -07:00
Michal Konecny
c8b62faaa4 [nagios_client] Fix for mailman api check
The mailman is now returning HTTP/1.1 instead of HTTP/1.0.

Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2024-06-28 10:19:10 +02:00
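A minimal sketch of the kind of status-line match that tolerates either protocol version, in Python. The function name `status_line_ok` and the expected status code 200 are assumptions for illustration; the actual check in the repository may be written differently.

```python
import re

# Accept both HTTP/1.0 and HTTP/1.1 in the status line (assumed format).
STATUS_LINE = re.compile(r"^HTTP/1\.[01] (\d{3})")


def status_line_ok(line, expected=200):
    """Return True if the status line carries the expected status code."""
    match = STATUS_LINE.match(line)
    return bool(match) and int(match.group(1)) == expected


print(status_line_ok("HTTP/1.0 200 OK"))
print(status_line_ok("HTTP/1.1 200 OK"))
print(status_line_ok("HTTP/1.1 503 Service Unavailable"))
```

Matching on the version with a character class avoids having to touch the check again if the server switches between the two versions.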
James Antill
d7258e320e Add DNF countme nagios checks.
Signed-off-by: James Antill <james@and.org>
2024-06-27 17:35:23 +00:00
Kevin Fenzi
84a7a7afc8 nagios: adjust nrpe for badges vs old fedbadges
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-28 13:54:53 -07:00
Kevin Fenzi
d366194a22 module-build-service (mbs): retire service
With the EOL of Fedora 38 yesterday, we are no longer building any
modules and can retire our module build service.

Note that toddlers still needs to be adjusted; that will happen after
this.

Thanks for all the modules!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-05-22 13:38:53 -07:00
Kevin Fenzi
c84b99223c osbs: raise a glass for its service
This removes osbs and almost all its associated playbooks and files.

It served long and well, but we no longer need it.
flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did, they could
be added to kiwi.

This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to setup ansible on a control
host and run it from our ansible).

Good bye osbs!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Seddik Alaoui Ismaili
c05bcd289f remove pynag from check ipa replica 2024-02-27 13:16:46 +00:00
Kevin Fenzi
a60ca7159f nuancier: retire and remove from ansible
See https://pagure.io/fedora-infrastructure/issue/11371
This service is retired.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-11-15 10:44:00 -08:00
Andrew Heath
96d9ed3d6b Adding more checks for the fedmsg socket 2023-08-16 14:08:16 -04:00
Kevin Fenzi
0066f3cc68 proxies / fedmsg_monitoring: revert part of last config change
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-08-15 14:37:27 -07:00
Kevin Fenzi
1c0516c831 nagios_client: adjust fedmsg monitoring
Copy the fixes from exceptions monitoring to backlog.
Fix the calls that were passing a trailing - which isn't needed anymore.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-08-15 14:25:56 -07:00
Andrew Heath
c5daa84f53 Have script check for fedmsg socket 2023-08-15 21:18:18 +00:00
Kevin Fenzi
22dde8163b unbound: remove and retire unbound servers
These instances served long and well as fallback resolvers for
dnssec-trigger. This is no longer needed or used, so let's remove them.
See https://pagure.io/fedora-infrastructure/issue/11415

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-24 14:40:43 -07:00
Pavel Raiskup
0944ac4ef3 copr-dist-git: decrease storage warning quota
With 5T storage, it is enough to warn on remaining 12%, and error on 6%.
2023-07-24 07:14:16 +02:00
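As a quick check of what those percentages mean in absolute terms, here is a small Python sketch. The helper name `free_space_thresholds` is hypothetical; the 5T, 12% and 6% figures come from the commit message above.

```python
def free_space_thresholds(total_tb, warn_pct, crit_pct):
    """Convert remaining-space percentages into absolute terabytes."""
    return total_tb * warn_pct / 100, total_tb * crit_pct / 100


warn_tb, crit_tb = free_space_thresholds(5, 12, 6)
print(f"warn below {warn_tb:.1f}T free, error below {crit_tb:.1f}T free")
```

With these inputs the warning fires at roughly 0.6T free and the error at roughly 0.3T free.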
Kevin Fenzi
314fa870a9 notifs-backend: fix check script and increase limits
The check_rabbitmq_size script seems to have critical and warning
backwards and is doing str comparisons when int should be used.
Also increase the limits a bunch as we don't want to be notified if it's
just backlogged a bit.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-07-10 15:00:04 -07:00
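The two problems described in that commit (swapped warning/critical levels and string comparison of numbers) can be illustrated with a short Python sketch. This is not the actual check_rabbitmq_size script; the function name `classify_queue_size` and the sample thresholds are assumptions, and only the 0/1/2 exit codes follow the usual Nagios plugin convention.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL = 0, 1, 2


def classify_queue_size(size, warning, critical):
    """Return a Nagios exit code for a queue size.

    Arguments may arrive as strings (for example from sys.argv), so they
    are converted to int before comparing.
    """
    size, warning, critical = int(size), int(warning), int(critical)
    # Test the critical level first so it takes precedence over warning.
    if size >= critical:
        return CRITICAL
    if size >= warning:
        return WARNING
    return OK


# String comparison is lexicographic, so "9" sorts after "1000".
print("9" > "1000", 9 > 1000)
print(classify_queue_size("9", "1000", "10000"))
print(classify_queue_size("5000", "1000", "10000"))
print(classify_queue_size("20000", "1000", "10000"))
```

Converting once at the top of the function keeps every later comparison numeric, regardless of how the caller obtained the values.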
Andrew Heath
a4ca219ed9 updating to point at monitoring-fedmsg-hub-3.socke 2023-05-30 15:49:45 -04:00
Andrew Heath
9a3bd882df adding () per python3 2023-05-30 14:41:30 -04:00
Andrew Heath
de5ab8f045 updated the script to fix issues 2023-05-30 13:30:23 -04:00
Pavel Raiskup
09c7c868c6 copr-be: nagios: decrease the quota warning even more 2023-05-26 09:16:37 +02:00
Andrew Heath
9121258f52 reenable ansible nagios busgateway01 checks 2023-05-23 12:13:31 -04:00
Pavel Raiskup
2ed4e90feb copr-be: decrease the nagios warning quota to 10%
10% is still ~2.4T of free space, ATM it looks like enough to not start
the panic mode.

https://github.com/fedora-copr/copr/issues/2737
2023-05-22 15:35:40 +02:00
Andrew Heath
9d3c107ef0 Disabling ansible check till we can troubleshoot 2023-05-19 20:07:41 +00:00
Andrew Heath
3600553301 removing nommer and fixing RPM sign 2023-05-19 20:07:41 +00:00
Andrew Heath
1bbd805e17 Remove remaining greenwave checks for busgateway 2023-04-11 17:32:49 +00:00
Pavel Raiskup
d0f7c7ca30 copr: use again a deterministic nrpe UID
It was notoriously colliding with other system users like copr-signer
and others.

Revert "copr: test without nrpe_client_uid specified"

This reverts commit 435b71a695.
2022-11-22 10:54:00 +01:00
Pavel Raiskup
435b71a695 copr: test without nrpe_client_uid specified
Revert "copr: define nrpe_client_uid=500"

This reverts commit fa5cd7344c.
2022-11-22 10:41:26 +01:00
Pavel Raiskup
fa5cd7344c copr: define nrpe_client_uid=500 2022-11-22 10:37:15 +01:00
Pavel Raiskup
baa6a0dff0 nagios_client: typo s/null/omit/ 2022-11-22 10:25:54 +01:00
Pavel Raiskup
2627babd44 nagios_client: precreate nrpe client
With a specific UID if {{ nrpe_client_uid }} is defined.
2022-11-22 10:16:14 +01:00
Kevin Fenzi
18eecec303 nagios: adjust redhat.com email check
Right now there's often a backlog due to it not responding to the fedora
side. So, let's bump up the checks a bit so they do not alert all the
time.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-10-12 12:45:48 -07:00
Aurélien Bompard
8962731dbc Don't use datetime.fromtimestamp yet
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 18:37:27 +02:00
Aurélien Bompard
e979a1955e Update the datanommer Nagios check to query datagrepper directly
Signed-off-by: Aurélien Bompard <aurelien@bompard.org>
2022-05-24 16:17:14 +00:00
Pavel Raiskup
120acfb3e7 copr-be: really setup the copr-be storage warning to 12%
The templates got de-synced.
2022-04-23 23:54:23 +02:00
Pavel Raiskup
3186e413d6 nagios/copr: monitor inodes (and one additional volume) 2022-02-08 22:53:43 +01:00
Mikolaj Izdebski
26c38caafa nagios: Remove check for supybot fedmsg plugin
Zodbot no longer has fedmsg plugin installed - supybot-fedmsg package
is not installed on value02 (RHEL 8) and supybot-fedmsg upstream
project on GitHub has been archived.
2021-11-03 22:49:21 +00:00
Jakub Kadlcik
9a8acc79ae nagios: enable disk monitoring for copr instances
I think that / monitoring should work by default just by
setting `nrpe: true` because of

    define service {
      hostgroup_name        all, !mincheckgrp
      service_description   Disk_Space_/
      check_command         check_by_nrpe!check_disk_/
      use                   disktemplate
    }
2021-08-09 11:45:53 +00:00
Pavel Raiskup
29fb33bbb7 copr-be: test remaining results storage space 2021-07-28 13:51:16 +02:00
Pavel Raiskup
92ff0683f5 nrpe: check_disk order (almost) alphabetically
Without this, it was hard to tell if check_disk.cfg.j2 mirrors
nrpe.cfg.j2.
2021-07-28 13:41:26 +02:00
Michael Scherer
3b8504f293 Fix mention of Freenode 2021-07-02 11:17:20 +02:00
seddikalaouiismaili
ac9750d6a0 correct output message for nagios check 2021-06-07 23:48:23 +00:00
Francois Andrieu
d9fc78b0e4 nagios: remove MBSProducer check from mbs-backend 2021-05-21 18:58:14 +00:00
Francois Andrieu
9006cf784e nagios: remove unused check_datanommer_faf 2021-05-21 18:57:09 +00:00
Kevin Fenzi
d890a9fbf4 bugzilla2fedmsg: drop checks against vm as it has moved to openshift
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-05-19 12:00:49 -07:00
Kevin Fenzi
740109a295 nagios_client / check_systemd_units: remove old debugging output
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 14:25:17 -07:00
Kevin Fenzi
cebb78ed82 nagios_client: the check_systemd_units is in scripts, not script
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-03-25 13:58:20 -07:00
seddikalaouiismaili
eae91f0d2b install nrpe check for systemd units 2021-03-25 20:16:48 +00:00
Pierre-Yves Chibon
a32dabc92e nagios_client: install the pagure systemd checks on all pagure instances
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2021-02-12 12:37:26 +01:00
seddikalaouiismaili
890dd31cb0 script to monitor systemd units on pagure 2021-02-12 11:34:57 +00:00
Pierre-Yves Chibon
65c85dd5ec nagios: Fix the check_supybot_pugin
Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 17:03:57 +02:00
Pierre-Yves Chibon
342c056ae4 nagios_client: Fix the check_ipa_replication plugin
It looks like the data it retrieves is in bytes and thus needs to be
decoded into a unicode string so we can use it as a regular string
in our code later.

Fixes https://pagure.io/fedora-infrastructure/issue/9372

Signed-off-by: Pierre-Yves Chibon <pingou@pingoured.fr>
2020-10-06 10:46:45 +02:00
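The bytes-versus-string issue described in that commit is common when porting checks to Python 3. A minimal sketch follows, assuming the data is UTF-8; the helper name `to_text` and the sample payload are hypothetical and not taken from the check_ipa_replication plugin itself.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


raw = b"replication status: 0"  # hypothetical payload, for illustration only
text = to_text(raw)

# After decoding, ordinary str operations work as expected.
print(text.startswith("replication"))
print(text.split(": ")[1])
```

Decoding at the point where the data enters the program means the rest of the code can treat it as a regular string.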