Files
fedora-infra_ansible/roles/copr/hypervisor

HOW TO SETUP A NEW HYPERVISOR
=============================

First make sure you understand how resalloc configuration works
(pools.yaml contents and format), and that the referenced scripts (like
'libvirt-new' and 'vm-delete') can correctly identify the pool/hypervisor
(those scripts likely need to be updated when you are adding new
hypervisor).  Especially check that the scripts can assign (and later
remove) appropriate IPv6 addresses to builders, work with swap, etc.

Before the groups/copr-hypervisor.yml is run against the new host,
manually prepare the disk layout.  Typically there are many disks we might
be tempted to place everything into raid6, but this would be suboptimal!
We do something like this instead (on each of those disks):

sdd                      8:48   0 279.4G  0 disk
├─sdd1                   8:49   0     8M  0 part
├─sdd2                   8:50   0     1G  0 part
│ └─md0                  9:0    0  1022M  0 raid1 /boot
├─sdd3                   8:51   0    40G  0 part  [SWAP]
├─sdd4                   8:52   0     1K  0 part
├─sdd5                   8:53   0    10G  0 part
│ └─md127                9:127  0    60G  0 raid6
│   └─vg_server-root   253:0    0    32G  0 lvm   /
└─sdd6                   8:54   0 228.4G  0 part
  └─md126                9:126  0   1.8T  0 raid0
    └─vg_guests-images 253:1    0   1.8T  0 lvm   /libvirt-images

The layout description:

  1. I.e. /boot (or efi) partition is on raid1 spread across all (even 10)
     the disks.
  2. There's SWAP partition, namely because if there are multiple SWAP
     partitions, with the same priority mount option, kernel spreads the
     swap use among all the partitions uniformly.
  3. There's / partition, this can be on raid6 because we don't write to
     that partition that often, and some redundancy is needed to not loose
     the machine upon a disk failure.  We create `vg_server-root` LV on
     the raid6 to ease potential future data movements to different disks.
  4. There's the large 'raid0' which hosts `vg_guests-images` LV, which is
     mounted on /libvirt-images for use by libvirt.  This is a read-write
     intensive place, so raid0 helps us to speed things up.

Manually create the br0 interface:

  TODO: Not done by @praiskup but the infra team, we need a HOWTO!


General VM requirements for builder VMs
---------------------------------------

- 2xvCPU+
- 4GB+ RAM
- 16G for built results => /var/lib/copr-rpmbuild
-  4G for the root / partition
- SWAP+RAM
  - 32GB tmpfs mountpoint for /var/cache/mock
  - 140GB+ SWAP for tmpfs for mock root(s) => /var/lib/mock


AMD hypervisors
===============

vmhost-x86-copr0[1-4].rdu-cc.fedoraproject.org

- 256G RAM
- 64 threads (2x16 cores, AMD EPYC 7302 16-Core Processor)

This brings us 32+ (few devel) overcommitted builders on each builder:

 - 32*16G = up to 512 GRAM (256 RAM + 300 G SWAP)
 - 4G+16G = 20G*32 = up to 640G for root+results
 - 172G*32 = up to 5.5T SWAP for tmpfs (huge overcommit, though only limitted
   amount of build tasks require that much tmpfs)


Power8 hypervisors
==================

- 128G RAM


Power9 hypervisor
=================

vmhost-p09-copr01.rdu-cc.fedoraproject.org

- storage:
    ├─vg_guests-LogVol00 253:0    0     32G  0 lvm   /
    ├─vg_guests-swap     253:1    0    300G  0 lvm   [SWAP]
    └─vg_guests-images   253:2    0 1002.6G  0 lvm   /libvirt-images

- 256G RAM

- 160 threads
    - Thread(s) per core:  4
      Core(s) per socket:  20
      Socket(s):           2

- Let's take the risk, and overload by 32 + 1 stage.  So each of them has only
  about 32G storage.

Overall allocation:
    33*16G RAM => up to 528G used (256RAM + 300G SWAP can hold this)
    33*140G disk => 4TB+ ... about 400% overcommit in the worst case
    33*5 CPU = 165CPU used in the worst case