[staging] kafka data dir over 80%
Closed, Migrated

Description

root@journal0:/tmp# df -h /srv/kafka/logdir
Filesystem      Size  Used Avail Use% Mounted on
kafka-volume    481G  409G   73G  85% /srv/kafka/logdir

Event Timeline

vsellier changed the task status from Open to Work in Progress. (Dec 17 2020, 9:58 AM)
vsellier triaged this task as Normal priority.
vsellier created this task.

After one week, the disk space used by Kafka had reached about 85%:

root@journal0:/tmp# df -h /srv/kafka/logdir
Filesystem      Size  Used Avail Use% Mounted on
kafka-volume    481G  409G   73G  85% /srv/kafka/logdir
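The check behind this task can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, and the 80% threshold is taken from the task title, not from any monitoring configuration shown here.

```shell
# Read `df -P <path>` (or `df -h <path>`) output on stdin and print
# the Use% value of the first filesystem line, without the % sign.
df_use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage is strictly above the threshold.
# $1: usage in percent, $2: threshold in percent (assumed to be 80 here).
over_threshold() {
    [ "$1" -gt "$2" ]
}

# Example usage on the host (not executed here):
#   used=$(df -P /srv/kafka/logdir | df_use_percent)
#   over_threshold "$used" 80 && echo "kafka data dir over 80%: ${used}%"
```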

Unlike in production, compression was not enabled on the ZFS pool:

root@kafka1:~#  zfs get all data/kafka  | grep compress
data/kafka  compressratio         1.55x                  -
data/kafka  compression           lz4                    inherited from data
data/kafka  refcompressratio      1.55x                  -
root@journal0:/tmp# zfs get all  | grep compress
kafka-volume  compressratio         1.00x                  -
kafka-volume  compression           off                    default
kafka-volume  refcompressratio      1.00x                  -
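For a quicker comparison between hosts, the relevant properties can be queried directly instead of filtering `zfs get all` through grep. A minimal sketch, assuming the dataset names shown above; the function name is hypothetical:

```shell
# Print "<dataset> <compression> <compressratio>" for one ZFS dataset.
# -H drops the header line, -o value prints only the value column.
zfs_compression_summary() {
    dataset="$1"
    compression=$(zfs get -H -o value compression "$dataset") || return 1
    ratio=$(zfs get -H -o value compressratio "$dataset") || return 1
    printf '%s %s %s\n' "$dataset" "$compression" "$ratio"
}

# Example usage (not executed here):
#   on kafka1:   zfs_compression_summary data/kafka
#   on journal0: zfs_compression_summary kafka-volume
```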

So compression was enabled:

root@journal0:/tmp# zfs set compression=lz4 kafka-volume
root@journal0:/tmp# zfs get all  | grep compress
kafka-volume  compressratio         1.00x                  -
kafka-volume  compression           lz4                    local
kafka-volume  refcompressratio      1.00x                  -

As this setting only applies to newly written data, we forced a compaction of the biggest topics (`directory`, `revision` and `content`) by lowering `min.cleanable.dirty.ratio`, so that Kafka rewrites their segments:

 % ./kafka-topics.sh --zookeeper $ZK  --alter --topic swh.journal.objects.revision --config min.cleanable.dirty.ratio=0.01
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
         Going forward, please use kafka-configs.sh for this functionality
Updated config for topic swh.journal.objects.revision.
vsellier@journal0 /opt/kafka/bin
 % ./kafka-topics.sh --zookeeper $ZK  --alter --topic swh.journal.objects_privileged.revision --config min.cleanable.dirty.ratio=0.01
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
         Going forward, please use kafka-configs.sh for this functionality
Updated config for topic swh.journal.objects_privileged.revision.

 % ./kafka-topics.sh --zookeeper $ZK  --alter --topic swh.journal.objects.directory --config min.cleanable.dirty.ratio=0.01
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
         Going forward, please use kafka-configs.sh for this functionality
Updated config for topic swh.journal.objects.directory

 % ./kafka-topics.sh --zookeeper $ZK  --alter --topic swh.journal.objects.content --config min.cleanable.dirty.ratio=0.01
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
         Going forward, please use kafka-configs.sh for this functionality
Updated config for topic swh.journal.objects.content
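The deprecation warning in the output above points to `kafka-configs.sh`. A possible equivalent is sketched below, assuming the installed Kafka release still accepts `--zookeeper` for topic configuration (newer releases expect `--bootstrap-server` instead). The function name and the `KAFKA_CONFIGS` variable are hypothetical. The log cleaner only considers a compacted log once its dirty ratio exceeds `min.cleanable.dirty.ratio` (0.5 by default), so a value of 0.01 makes almost any log eligible.

```shell
# Set min.cleanable.dirty.ratio on one topic via kafka-configs.sh.
# $1: topic name, $2: ratio (for example 0.01).
# ZK must hold the ZooKeeper connect string, as in the commands above.
# KAFKA_CONFIGS is a hypothetical override for the script location.
set_dirty_ratio() {
    "${KAFKA_CONFIGS:-/opt/kafka/bin/kafka-configs.sh}" \
        --zookeeper "${ZK:?set ZK to the ZooKeeper connect string}" \
        --entity-type topics --entity-name "$1" \
        --alter --add-config "min.cleanable.dirty.ratio=$2"
}

# Example usage (not executed here):
#   for t in revision directory content; do
#       set_dirty_ratio "swh.journal.objects.$t" 0.01
#   done
```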

After the compaction:

root@journal0:~# df -h /srv/kafka/logdir
Filesystem      Size  Used Avail Use% Mounted on
kafka-volume    481G  187G  295G  39% /srv/kafka/logdir

root@journal0:~# zfs get all  | grep compress
kafka-volume  compressratio         2.21x                  -
kafka-volume  compression           lz4                    local
kafka-volume  refcompressratio      2.21x                  -
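To put the before and after figures in perspective, the arithmetic can be sketched as follows. The function names are hypothetical, and the inputs are the rounded `df -h` values, so the results are approximate.

```shell
# Percentage of disk space freed. $1: used before, $2: used after.
percent_saved() {
    awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (b - a) / b * 100 }'
}

# Estimated uncompressed (logical) size. $1: used on disk, $2: compressratio.
logical_size() {
    awk -v u="$1" -v r="$2" 'BEGIN { printf "%.1f\n", u * r }'
}

percent_saved 409 187   # prints 54.3
logical_size 187 2.21   # prints 413.3
```

The estimated logical size (about 413G) is close to the 409G used before the change, which suggests, though it does not prove, that most of the gain comes from lz4 compression rather than from records dropped by the compaction.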

The low `min.cleanable.dirty.ratio` setting is left in place on these topics for the moment.