123 Commits

Author SHA1 Message Date
Kevin Fenzi
231dbb29ec nagios: add some more hosts to rdu3_external
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2026-02-16 13:25:59 -08:00
Kevin Fenzi
0db48ee5ce nagios: add proxy03/14 to rdu3_external list so noc02 works
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2026-02-16 13:14:18 -08:00
Kevin Fenzi
737608a2e2 Revert "nagios / external: try and put pagure01 in rdu3_external to see if that makes noc02 happy"
This reverts commit 2d3797de65.

This just adds confusion, try reverting it for now.
2025-12-08 11:09:56 -08:00
Kevin Fenzi
2d3797de65 nagios / external: try and put pagure01 in rdu3_external to see if that makes noc02 happy
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-12-03 17:08:34 -08:00
Greg Sutcliffe
6c990442c9 Added buildhw-x86-08.rdu3.fedoraproject.org
Signed-off-by: Greg Sutcliffe <fedora@emeraldreverie.org>
2025-09-04 14:43:45 +01:00
Greg Sutcliffe
1b6daba47b Added buildhw-x86-09.rdu3.fedoraproject.org to inventory too
Signed-off-by: Greg Sutcliffe <fedora@emeraldreverie.org>
2025-09-03 16:29:40 +01:00
Greg Sutcliffe
4ac40e1cc3 Add buildhw-x86-10.rdu3.fedoraproject.org
Signed-off-by: Greg Sutcliffe <fedora@emeraldreverie.org>
2025-09-02 14:39:46 +01:00
Greg Sutcliffe
ce54370f13 Add buildhw-x86-12.rdu3.fedoraproject.org
Signed-off-by: Greg Sutcliffe <fedora@emeraldreverie.org>
2025-09-01 13:10:36 +01:00
Kevin Fenzi
c6ed99beb9 nagios: drop some trailing .s on entries that confuse the nagios http_check plugin
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-07-24 13:36:52 -07:00
Greg Sutcliffe
43d29fc0bf Added buildhw-x86-13.rdu3.fedoraproject.org - in the other places
Signed-off-by: Greg Sutcliffe <fedora@emeraldreverie.org>
2025-07-24 15:58:13 +01:00
Kevin Fenzi
1b43d4160c nagios: fix duplicate mgmt host 01 that should have been 02
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-07-23 17:01:07 -07:00
Kevin Fenzi
b44c28e08d inventory: add some buildhw's to inventory/nagios
We want to monitor the a64 and x86 buildhw devices too.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-07-21 14:36:43 -07:00
Greg Sutcliffe
0d71c0bce0 Nagios: remove http check on p10 mgmt interface
Signed-off-by: Greg Sutcliffe <fedora@emeraldreverie.org>
2025-07-11 20:06:33 +00:00
Kevin Fenzi
233ec96688 inventory: drop nonexistent machines
These are various machines that are not yet deployed, or no longer exist
in rdu3 (though they did in iad2). This should clean up nagios
a fair bit and when/if we redeploy these we can add them back in.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-07-09 10:26:51 -07:00
Nils Philippsen
6c85fda0c9 Mass remove/replace iad2 -> rdu3, 10.3. -> 10.16.
Signed-off-by: Nils Philippsen <nils@redhat.com>
2025-07-03 20:05:02 +02:00
Kevin Fenzi
da9b97676e nagios: There is not a bvmhost-x86-05 in rdu3 staging
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-28 12:45:39 -07:00
Kevin Fenzi
58bdf975c0 dns: actually serve the rdu3 mgmt zone to requests for it instead of the iad2 one
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-23 15:53:20 -07:00
Kevin Fenzi
9cc7fce540 nagios: add rdu3_management hosts
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2025-06-20 21:13:25 -07:00
Michal Konecny
6428f8f772 Sunset github2fedmsg and fedmsg
This commit is removing all the fedmsg related stuff from ansible
repository.

Signed-off-by: Michal Konecny <mkonecny@redhat.com>
2025-02-13 10:08:51 +00:00
James Antill
9e32ac422e Remove bvmhost-a64 01-02, 07-13, 19-24.
Signed-off-by: James Antill <james@and.org>
2025-01-28 21:47:30 +00:00
iamyaash
b3d6a90b9a motd generic template added
migrated notes from infra/hosts

motd changes; excluding CSI infos

removed csi_* vars from group_vars; converted csi_purpose & csi_relationship into notes

fixed merge conflicts

minor changes; var

updating YAMLs & playbooks

updated YAMLs & playbooks again

updated correctly; buildhw.yml

fixing merge conflicts

dest added in motd.yml
2025-01-28 01:10:14 +00:00
Seddik Alaoui Ismaili
3f7749fe23 remove logdetective01 from nagios
2025-01-09 17:31:56 +00:00
Kevin Fenzi
069e2cbc9f nagios: clear old retired hosts mgmt
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-08-27 12:04:23 -07:00
Kevin Fenzi
33627c2ada bvmhost-a64s moving to buildhw
We have these 7 emags that were bvmhosts running 32bit arm builders.
Since we no longer need those, let's repurpose them as aarch64 buildhw.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-07-16 17:32:35 -07:00
Kevin Fenzi
3b2853b5d4 nagios / staging: fix staging vmhost mgmt
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-04-18 14:20:03 -07:00
Kevin Fenzi
838338e312 IAD2 datacenter changes
There were folks on site this week to rack new machines/pull old
machines, and unfortunately we don't really have much control over when
this happens based on our freeze, so I am just pushing this as part of
the 'do what's required to handle an outage'.

We did the following changes:

- removed old autosign01 (was out of service as we moved to autosign02 a
  while ago)

- removed vmhost-x86-08/09. We also want to migrate off 07 soon and
  remove it next visit. A new vmhost-x86-08 is installed to replace
  these 3.

- removed vmhost-x86-03/04.stg. Added new vmhost-x86-01.stg to replace
  them both.

- added a new kernel02 to replace kernel01 the next onsite trip.
  This machine still needs switch ports configured.

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-04-18 12:53:13 -07:00
Kevin Fenzi
c84b99223c osbs: raise a glass for its service
This removes osbs and almost all its associated playbooks and files.

It served long and well, but we no longer need it.
flatpaks are building with a koji-flatpak plugin.
base/minimal/toolbox containers are building with kiwi.
We aren't building any other containers right now, and if we did they could
be added to kiwi.

This is the end of an era... I look with nostalgia on
ansible-ansible-openshift-ansible (a role to set up ansible on a control
host and run it from our ansible).

Good bye osbs!

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-03-28 12:52:07 -07:00
Kevin Fenzi
63a7ea4b85 autosign01: remove from inventory/monitoring, replaced by autosign02
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2024-01-30 16:47:32 -08:00
David Kirwan
9c3a24e79a zabbix: Zabbix production configuration
2023-11-09 12:55:26 +00:00
Kevin Fenzi
2fd9695820 nagios: remove some no longer existent machines from nagios trying to monitor their mgmt interfaces
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2023-03-07 18:43:30 +00:00
Kevin Fenzi
35c1d99d08 nagios: adjust config to work
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-06-30 15:04:27 -07:00
Kevin Fenzi
8455bd63a0 add sign-vault01/02 and autosign01 mgmt interfaces to monitoring
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-06-30 14:22:18 -07:00
Kevin Fenzi
19d2fbffbf inventory: add some power mgmt interfaces
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-05-26 10:24:12 -07:00
Kevin Fenzi
47ccbd5e1b remove bvmhost-p08-01.stg
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-05-19 14:32:03 -07:00
Kevin Fenzi
b408e1ad64 nagios: update all the openshift4 compute nodes for monitoring
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-21 12:28:43 -08:00
Kevin Fenzi
92945b3a27 nagios: add a bunch more mgmt interfaces
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2022-02-21 11:20:48 -08:00
Kevin Fenzi
580cd252c5 Inventory group/host variables: Sort yaml
This was done using yq (
https://mikefarah.gitbook.io/yq/operators/sort-keys )

Doing things this way makes it much easier to see if a variable is set
in a file or if two hosts differ in what variables they set. Hopefully
we can keep things sorted moving forward.

Basically this means just sort a-z anything you add to any host or group
variable and it will be in the right place.

Additionally, this enforces 'normal' indent rules for all the variable
files which we should also try and obey. 2 spaces for first level, 3 for
next, etc. When in doubt you can run yq on it.

This should cause NO actual variable changes, it's all just readability
fixing for humans, ansible parses it exactly the same.
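
To illustrate what the sorting amounts to, here is a minimal Python sketch that sorts mapping keys a-z at every level and leaves list order and values untouched. It is not the yq invocation used for the commit; the function name and the sample host variables are hypothetical.

```python
def sort_keys(node):
    """Return a copy of node with all mapping keys sorted a-z, recursively.

    Lists keep their order; scalar values are returned unchanged.
    """
    if isinstance(node, dict):
        return {key: sort_keys(node[key]) for key in sorted(node)}
    if isinstance(node, list):
        return [sort_keys(item) for item in node]
    return node


# Hypothetical host variables, invented for this example only.
host_vars = {
    "vmhost": "bvmhost-x86-01",
    "datacenter": "rdu3",
    "nagios": {"notify": True, "check": "ping"},
}

sorted_vars = sort_keys(host_vars)
print(list(sorted_vars))            # top-level keys in a-z order
print(list(sorted_vars["nagios"]))  # nested keys in a-z order
```

Because only key order changes, the sorted structure compares equal to the original, which matches the claim that the variables themselves do not change.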

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-11-16 13:27:57 -08:00
Kevin Fenzi
7c15f8e022 adjust bkernel mgmt interface names
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-06-01 18:34:04 -07:00
Kevin Fenzi
3c12ef6aa9 Killed trailing spaces in group/host vars with fire.
Normally it's just a nitpick to not have trailing spaces on variables.
However, for some things like mac address, it really matters.
Bunches of buildhw's were failing ansible because they were passing
"mac address " to linux-system-roles networking and ansible was going
'huh, nope, I can't find that mac address here at all'.
So, just blow all the trailing spaces away to avoid any other variables
that hit this.
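
As a rough sketch of this kind of cleanup, the Python below strips trailing spaces and tabs from every line of a text. It is not the command actually used for this commit; the function name and the sample content are hypothetical.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces and tabs from each line, keeping the line breaks."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


# Hypothetical variable file content with stray trailing whitespace.
sample = 'mac_address: "52:54:00:aa:bb:cc"   \nname: buildhw-x86-01\t\n'
cleaned = strip_trailing_whitespace(sample)
print(repr(cleaned))
```

Only whitespace at the end of a line is touched, so indentation and spaces inside values are preserved.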

Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-05-04 08:52:52 -07:00
Kevin Fenzi
f3ab68c101 zabbix/staging: try and exclude it from nagios monitoring harder
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2021-04-05 15:36:51 -07:00
Nick Bebout
0eae657232 Fix sudo rules for sysadmin-noc and sysadmin-veteran
2021-03-28 20:46:01 -05:00
Nils Philippsen
77c3daa9b7 ipa/client: enable for nagios in prod
Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-03-24 13:44:33 +01:00
Nils Philippsen
dbbf94a411 ipa/client: configure global shell access and sudo
Almost global anyway, i.e. inside the VPN.

The ipa/client-based shell access and sudo rules are only effective for
staging right now, the respective playbook bits are masked out for prod.

- Assign Ansible host groups to IPA host groups, the latter don't care
  about 'stg' in the name and use dashes rather than underscores.
- Distill shell access groups from fas_client_groups in group and host
  vars.
- Let all `sysadmin-*` groups in the previous list run anything via sudo
  in the host group (except bastion & batcave).
- Remove `fas_client_groups` from staging host and group vars.
- Remove sudoers from staging host and group vars if only `sysadmin-*`
  groups have shell access.
- Set up `ipa_client_shell_groups` on bastion to be a super set of the
  same on batcave.
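
The naming rule in the first item above can be sketched as follows. This is a minimal illustration, not the logic from the playbooks: it assumes `stg` appears as its own underscore-separated part of the Ansible group name, and the function name is hypothetical.

```python
def ipa_host_group(ansible_group: str) -> str:
    """Map an Ansible host group name to an IPA-style host group name.

    Assumption: 'stg' is a separate underscore-delimited part of the name.
    It is dropped, and the remaining parts are joined with dashes.
    """
    parts = [part for part in ansible_group.split("_") if part != "stg"]
    return "-".join(parts)


for group in ("openqa_workers_stg", "kernel_qa", "bodhi"):
    print(group, "->", ipa_host_group(group))
```

With this rule a production group and its staging counterpart resolve to the same IPA host group.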

Newly created IPA host groups:
- autosign
- badges
- basset
- bastion
- batcave
- blockerbugs
- bodhi
- bugzilla2fedmsg
- busgateway
- datagrepper
- dbserver
- dns
- fedimg
- github2fedmsg
- ipa
- kernel-qa
- kerneltest
- kojibuilder
- kojihub
- kojipkgs
- logging
- mailman
- memcached
- mirrormanager
- nagios
- notifs
- oci-registry
- odcs
- openqa
- openqa-workers
- osbs
- packages
- pdc-web
- pkgs
- proxies
- rabbitmq
- releng-compose
- resultsdb
- secondary
- sign-bridge
- sundries
- value
- wiki

Signed-off-by: Nils Philippsen <nils@redhat.com>
2021-02-01 22:23:41 +00:00
Kevin Fenzi
15b7255550 nagios: try and exclude centos_ipa_client_stg group from nagios
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-28 13:00:19 -07:00
Kevin Fenzi
ffac127467 retrace01: retrace01 isn't a thing, we have retrace03
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-10-02 16:42:51 -07:00
Stephen Smoogen
46d5a192aa add qvmhost-x86-01 to inventory
2020-09-03 14:47:53 -04:00
Stephen Smoogen
26b50b6192 remove opengear02 as it is in rdu and it is a long ping if it worked at all
2020-08-13 16:07:54 -04:00
Kevin Fenzi
d6ebdb44de inventory: nagios: adjust bkernel and qvmhost names for mgmt in nagios
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-07-06 14:04:32 -07:00
Stephen Smoogen
28ba173acb move the dns_external check to using a group variable in the nagios group. This takes it out of the main inventory where its names do not match and this other group was not used in any other playbook
2020-07-01 17:40:02 -04:00
Kevin Fenzi
2290817ace inventory: drop more autosign01 and bastion-comm01 rabbitmq: add monitoring plugin now.
Signed-off-by: Kevin Fenzi <kevin@scrye.com>
2020-06-30 17:10:32 -07:00