
Use local hypervisor storage in the loader pods
Closed, Migrated

Description

The loader pods should use local hypervisor storage to avoid putting unnecessary load on ceph when data is offloaded to disk during the parsing of big repositories.

On the current "classical" workers, the temporary files are written to a memory fs and the servers are configured to
swap to a partition hosted on local storage.

The same mechanism can't be used on the kubernetes infrastructure, as swap must be disabled on the
nodes themselves and the notion of swap doesn't exist in a pod.

Event Timeline

vsellier triaged this task as High priority. Sep 7 2022, 6:19 PM
vsellier created this task.
vsellier updated the task description.
vsellier updated the task description.
vsellier changed the task status from Open to Work in Progress. Sep 14 2022, 11:01 AM
vsellier claimed this task.
vsellier moved this task from Backlog to in-progress on the System administration board.

In order to test local storage for the nodes declared on uffizi, I configured a new scratch storage on this hypervisor,
following T3707#73522 and https://pve.proxmox.com/wiki/Storage:_LVM_Thin

root@uffizi:~# lvcreate -L200G -n proxmox-scratch vg-louvre
  Logical volume "proxmox-scratch" created.

root@uffizi:~# lvconvert --type thin-pool /dev/vg-louvre/proxmox-scratch
  Thin pool volume with chunk size 128.00 KiB can address at most 31.62 TiB of data.
  WARNING: Converting vg-louvre/proxmox-scratch to thin pool's data volume with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert vg-louvre/proxmox-scratch? [y/n]: y
  Converted vg-louvre/proxmox-scratch to thin pool.
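
Not part of the session above, but a quick sanity check of the conversion can be done with lvs (reporting fields from the standard lvs options):

root@uffizi:~# lvs -o lv_name,lv_attr,lv_size,data_percent vg-louvre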

Unfortunately, uffizi can't be added to the scratch storage pool together with the other nodes (Datacenter / Storage / Scratch -> Edit)
because its lvm configuration is not the same (no dedicated scratch vg), so I finally created a new scratch storage with only
uffizi in it.
The thin pool was not detected in the interface, so I manually declared the storage in /etc/pve/storage.cfg:

root@uffizi:/etc/pve# diff -U3 ~/storage.cfg storage.cfg
--- /root/storage.cfg	2022-09-14 11:03:55.444277955 +0000
+++ storage.cfg	2022-09-14 11:05:03.000000000 +0000
@@ -20,3 +20,9 @@
 	content images,rootdir
 	nodes hypervisor3,pompidou,branly

+lvmthin: uffizi-scratch
+	thinpool proxmox-scratch
+	vgname vg-louvre
+	content images,rootdir
+	nodes uffizi
+

and everything was fine.
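
For the record, the same storage declaration could probably also be made from the command line with pvesm instead of editing the file directly (options assumed from the Proxmox storage documentation):

root@uffizi:~# pvesm add lvmthin uffizi-scratch --vgname vg-louvre --thinpool proxmox-scratch --content images,rootdir --nodes uffizi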

Uffizi has a 200G lv allocated to the unused local storage, so if we need more space, we can reduce the size of this logical volume.

Rancher seems to create the emptyDir volumes in /var/lib/kubelet. Except for the /var/lib/kubelet/pki directory, everything in this directory is ephemeral, so we could easily use a partition backed by a local storage disk.
It would also remove unnecessary pressure on ceph for the pod-related data.
The /var/lib/docker directory could also be moved to this local partition, as everything in docker can be lost.
I will manually try that on one staging node to check whether it works before changing the terraform / puppet code.
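
As a side note, the emptyDir content of a running pod should be visible directly under the kubelet directory (path assumed from the default kubelet layout; the pod uid has to be filled in):

root@rancher-node-staging-worker2:~# ls /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/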

It works \o/

on rancher-node-staging-worker2,

  • move the second disk to the uffizi local storage pool (uffizi-scratch)
  • create a new zfs dataset on the data pool
root@rancher-node-staging-worker2:/var/lib# mv kubelet kubelete-save
root@rancher-node-staging-worker2:/var/lib# zfs create -o mountpoint=/var/lib/kubelet -o atime=off -o relatime=on -o compression=zstd data/kubelet
  • restart rancher and uncordon the node
  • create a container with an emptyDir volume mounted on /tmp (P1453; a minimal sketch follows this list)
  • generate some write load on the /tmp directory of the container
  • the write activity is visible on uffizi (and not on the network as before)
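
P1453 is not reproduced here, but a minimal sketch of such a test pod and of the write load could look like the following (pod name, image and sizes are arbitrary; only the emptyDir mounted on /tmp matters):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-write-test            # arbitrary name
spec:
  nodeName: rancher-node-staging-worker2
  containers:
  - name: writer
    image: debian:bullseye             # any image with coreutils works
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}                       # disk-backed, so it lands under /var/lib/kubelet
EOF

# generate some write load in the container's /tmp
kubectl exec emptydir-write-test -- dd if=/dev/zero of=/tmp/load.bin bs=1M count=4096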

The kubelet dataset will need to be created manually on all the rancher nodes (except staging worker2 and worker3, which are already configured) before applying D8482:

  • cluster-argo
  • archive-staging
  • archive-production
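
The preparation itself should be the same sequence as on worker2, roughly (node name to be substituted; the drain flags may vary with the kubectl version):

# reschedule the pods somewhere else
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data
# on the node: move the old directory away and mount the new dataset in its place
mv /var/lib/kubelet /var/lib/kubelet-save
zfs create -o mountpoint=/var/lib/kubelet -o atime=off -o relatime=on -o compression=zstd data/kubelet
# restart the rancher agent, then put the node back in rotation
kubectl uncordon <node>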

Example during the loading of https://github.com/torvalds/linux by a pod:

 % /usr/sbin/zfs list data/docker data/kubelet
NAME           USED  AVAIL     REFER  MOUNTPOINT
data/docker   3.81G  40.4G     83.2M  /var/lib/docker
data/kubelet  3.71G  40.4G     3.71G  /var/lib/kubelet

Compression is not as useful as it is for docker:

 % /usr/sbin/zfs get compressratio data/kubelet data/docker 
NAME          PROPERTY       VALUE  SOURCE
data/docker   compressratio  2.95x  -
data/kubelet  compressratio  1.07x  -

vsellier moved this task from in-progress to done on the System administration board.