HOW TO SETUP A NEW HYPERVISOR ============================= First make sure you understand how resalloc configuration works (pools.yaml contents and format), and that the referenced scripts (like 'libvirt-new' and 'vm-delete') can correctly identify the pool/hypervisor (those scripts likely need to be updated when you are adding new hypervisor). Especially check that the scripts can assign (and later remove) appropriate IPv6 addresses to builders, work with swap, etc. Before the groups/copr-hypervisor.yml is run against the new host, manually prepare the disk layout. Typically there are many disks we might be tempted to place everything into raid6, but this would be suboptimal! We do something like this instead (on each of those disks): sdd 8:48 0 279.4G 0 disk ├─sdd1 8:49 0 8M 0 part ├─sdd2 8:50 0 1G 0 part │ └─md0 9:0 0 1022M 0 raid1 /boot ├─sdd3 8:51 0 40G 0 part [SWAP] ├─sdd4 8:52 0 1K 0 part ├─sdd5 8:53 0 10G 0 part │ └─md127 9:127 0 60G 0 raid6 │ └─vg_server-root 253:0 0 32G 0 lvm / └─sdd6 8:54 0 228.4G 0 part └─md126 9:126 0 1.8T 0 raid0 └─vg_guests-images 253:1 0 1.8T 0 lvm /libvirt-images The layout description: 1. I.e. /boot (or efi) partition is on raid1 spread across all (even 10) the disks. 2. There's SWAP partition, namely because if there are multiple SWAP partitions, with the same priority mount option, kernel spreads the swap use among all the partitions uniformly. 3. There's / partition, this can be on raid6 because we don't write to that partition that often, and some redundancy is needed to not loose the machine upon a disk failure. We create `vg_server-root` LV on the raid6 to ease potential future data movements to different disks. 4. There's the large 'raid0' which hosts `vg_guests-images` LV, which is mounted on /libvirt-images for use by libvirt. This is a read-write intensive place, so raid0 helps us to speed things up. Manually create the br0 interface: TODO: Not done by @praiskup but the infra team, we need a HOWTO! General VM requirements for builder VMs --------------------------------------- - 2xvCPU+ - 4GB+ RAM - 16G for built results => /var/lib/copr-rpmbuild - 4G for the root / partition - SWAP+RAM - 32GB tmpfs mountpoint for /var/cache/mock - 140GB+ SWAP for tmpfs for mock root(s) => /var/lib/mock AMD hypervisors =============== vmhost-x86-copr0[1-4].rdu-cc.fedoraproject.org - 256G RAM - 64 threads (2x16 cores, AMD EPYC 7302 16-Core Processor) This brings us 32+ (few devel) overcommitted builders on each builder: - 32*16G = up to 512 GRAM (256 RAM + 300 G SWAP) - 4G+16G = 20G*32 = up to 640G for root+results - 172G*32 = up to 5.5T SWAP for tmpfs (huge overcommit, though only limitted amount of build tasks require that much tmpfs) Power8 hypervisors ================== - 128G RAM Power9 hypervisor ================= vmhost-p09-copr01.rdu-cc.fedoraproject.org - storage: ├─vg_guests-LogVol00 253:0 0 32G 0 lvm / ├─vg_guests-swap 253:1 0 300G 0 lvm [SWAP] └─vg_guests-images 253:2 0 1002.6G 0 lvm /libvirt-images - 256G RAM - 160 threads - Thread(s) per core: 4 Core(s) per socket: 20 Socket(s): 2 - Let's take the risk, and overload by 32 + 1 stage. So each of them has only about 32G storage. Overall allocation: 33*16G RAM => up to 528G used (256RAM + 300G SWAP can hold this) 33*140G disk => 4TB+ ... about 400% overcommit in the worst case 33*5 CPU = 165CPU used in the worst case