
Benchmark objstorage for mirror (uffizi vs. azure vs. s3)
Closed, Migrated

Description

For the mirrors, we need some idea of how fast objects can be read from each objstorage.

Event Timeline

douardda created this task.

Current benchmark scenario:

  • build a list of 1M sha1s (extracted from the storage's content table)
  • retrieve these 1M objects from each objstorage from a nearby machine (an azure VM for azure, a rocq machine for uffizi, an ec2 machine for s3) and measure how long it takes
  • retrieve these 1M objects from each objstorage from a distant machine (if possible) and measure how long it takes

The script used to perform the benchmark is P820; a few probing runs (limited to 30s) with different numbers of workers and threads were used to find a sweet spot.
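
P820 is not reproduced here; for orientation, below is a minimal sketch of the measurement loop, covering only the threaded part (the real script also spreads the load over several worker processes) and assuming a hypothetical fetch() callable standing in for the actual objstorage client (RPC endpoint or local config):

import sys
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(hex_sha1: str) -> bytes:
    """Hypothetical object getter: the real script goes through the
    objstorage RPC API (e.g. http://uffizi:5003/) or a local objstorage
    built from a YAML config file."""
    raise NotImplementedError


def safe_fetch(hex_sha1: str):
    """Return the object content, or None for a missing object / error."""
    try:
        return fetch(hex_sha1)
    except Exception:
        return None


def bench(sha1_file: str, threads: int) -> None:
    """Fetch every object listed in sha1_file (one hex sha1 per line)
    with `threads` concurrent getters, then print object and volume rates."""
    with open(sha1_file) as f:
        sha1s = [line.strip() for line in f if line.strip()]

    ok = errors = nbytes = 0
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for content in pool.map(safe_fetch, sha1s):
            if content is None:
                errors += 1
            else:
                ok += 1
                nbytes += len(content)
    elapsed = time.monotonic() - start

    print(f"{len(sha1s) / elapsed:.0f} obj/s total, "
          f"{ok / elapsed:.0f} obj/s ok, "
          f"{nbytes / elapsed / 1e6:.0f} MB/s, "
          f"{errors} missing/errors")


if __name__ == "__main__":
    bench(sys.argv[1], threads=int(sys.argv[2]))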

Some results:

Notes

  • tests have been run while ingestion was paused, so there was no write load on uffizi
  • granet has been under serious load (swh-graph workload) during these tests
  • errors on uffizi are missing objects (on banco?)
| objstorage | from               | workers | threads | object rate | volume rate | errors |
|------------|--------------------|---------|---------|-------------|-------------|--------|
| azure      | boatbucket (azure) | 64      | 8       | 3201 obj/s  | 199 MB/s    | 0      |
| s3         | boatbucket (azure) | 64      | 16      | 1569 obj/s  | 97 MB/s     | 9      |
| s3         | granet (rocq)      | 64      | 16      | 2329 obj/s  | 144 MB/s    | 7      |
| uffizi     | granet (rocq)      | 16      | 8       | 1137 obj/s  | 60 MB/s     | 157521 |
| uffizi     | granet (rocq)      | 32      | 8       | 1929 obj/s  | 101 MB/s    | 157521 |
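
As a sanity check, the object and volume rates are consistent with an average object size of roughly 60 KB: 199 MB/s ÷ 3201 obj/s ≈ 62 KB for the azure run, and 144 MB/s ÷ 2329 obj/s ≈ 62 KB for the s3 run from granet.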

Command used:

  • uffizi: python bench_reqrep.py -w 16 -c 8 -f content_sha1 http://uffizi:5003/
  • azure: python bench_reqrep.py -l ERROR -f content_sha1 -w64 -c8 azure.yml
  • s3: python bench_reqrep.py -f content_sha1 -w64 -c8 s3.yml

content_sha1 is the result of SELECT sha1 FROM content ORDER BY sha256 LIMIT 1000000.

Since the results on uffizi above suffered from a few caveats, I've run a few more tests:

  • a first result was obtained with a dataset that only contained objects stored on the XFS part of the objstorage
  • a second dataset was then created (with the ORDER BY sha256 clause to spread the sha1s)
  • but its results are a mix of hot and cold cache tests

Made a new dataset using:

select sha1 from content where sha256 > '\x000729010ac682fa942e4bfedb2366da310ca438c1677ef0812dbb53c42bcea2' order by sha256 limit 100000 \g content_sha1_block2

The given sha256 is the last one of the first dataset.
This dataset is smaller (100k) just to get rough numbers for now; a 1M test case will follow.
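
For subsequent blocks, the same keyset pagination can be scripted; here is a rough sketch using psycopg2 (the dump_next_block helper and the DSN are hypothetical, the table and column names come from the queries above, and sha1/sha256 are assumed to be bytea columns):

import psycopg2


def dump_next_block(dsn: str, last_sha256_hex: str, limit: int, out_path: str) -> str:
    """Dump the next block of `limit` sha1s (one hex digest per line),
    paginating on sha256 so successive blocks never overlap.
    Returns the last sha256 of this block, to feed to the next call."""
    with psycopg2.connect(dsn) as db, db.cursor() as cur:
        cur.execute(
            """SELECT sha1, sha256 FROM content
               WHERE sha256 > %s
               ORDER BY sha256 LIMIT %s""",
            (bytes.fromhex(last_sha256_hex), limit),
        )
        rows = cur.fetchall()

    with open(out_path, "w") as out:
        for sha1, _sha256 in rows:
            out.write(bytes(sha1).hex() + "\n")

    return bytes(rows[-1][1]).hex()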

I've run this test on XFS and ZFS separately, using local objstorage configs instead of the RPC server running on uffizi (so these tests were executed on uffizi itself).

Here, cold means the sha1s have not been retrieved in a previous run (fresh dataset), whereas hot means the same test has been executed a second time immediately.

| FS  | cache | workers | threads | total obj. rate (obj/s) | ok obj. rate (obj/s) | volume rate (MB/s) | missing objs |
|-----|-------|---------|---------|-------------------------|----------------------|--------------------|--------------|
| XFS | cold  | 32      | 16      | 1571                    | 871                  | 72                 | 44584        |
| XFS | hot   | 32      | 16      | 11118                   | 6164                 | 513                | 44584        |
| ZFS | cold  | 32      | 16      | 1036                    | 327                  | 24                 | 68557        |
| ZFS | hot   | 32      | 16      | 10662                   | 3355                 | 254                | 68557        |

For the sake of completeness, the objstorage config files are:

---
objstorage:
  cls: pathslicing
  args:
    root: "/srv/softwareheritage/objects-xfs"
    slicing: 0:1/0:2/2:4/4:6
    compression: gzip
client_max_size: 1073741824

and

---
objstorage:
  cls: pathslicing
  args:
    root: "/srv/softwareheritage/objects"
    slicing: 0:2/0:5
    compression: none
client_max_size: 1073741824
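
For illustration, here is my reading of how these slicing specs map a hex sha1 to an on-disk path (each start:end component selects a slice of the hex digest as one directory level, and the object file is named after the full digest; the actual swh.objstorage pathslicing implementation may differ in details):

import os


def slice_path(root: str, hex_sha1: str, slicing: str) -> str:
    """Build the on-disk path of an object under a pathslicing objstorage,
    using character slices of the hex digest as directory levels."""
    parts = []
    for spec in slicing.split("/"):
        start, end = (int(x) if x else None for x in spec.split(":"))
        parts.append(hex_sha1[start:end])
    return os.path.join(root, *parts, hex_sha1)


h = "34973274ccef6ab4dfaaf86599792fa9c3fe4689"
print(slice_path("/srv/softwareheritage/objects-xfs", h, "0:1/0:2/2:4/4:6"))
# /srv/softwareheritage/objects-xfs/3/34/97/32/34973274ccef...
print(slice_path("/srv/softwareheritage/objects", h, "0:2/0:5"))
# /srv/softwareheritage/objects/34/34973/34973274ccef...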

Same as before but with 1M (fresh) sha1s:

| FS  | cache | workers | threads | total obj. rate (obj/s) | ok obj. rate (obj/s) | volume rate (MB/s) | missing objs | handbrake |
|-----|-------|---------|---------|-------------------------|----------------------|--------------------|--------------|-----------|
| ZFS | cold  | 32      | 16      | 1207                    | 379                  | 32                 | 686110       | on        |
| ZFS | hot   | 32      | 16      | 2491                    | 783                  | 66                 | 686110       | on        |
| ZFS | cold  | 32      | 16      | 1250                    | 393                  | 31                 | 685130       | on        |
| ZFS | hot   | 32      | 16      | 2613                    | 824                  | 65                 | 685130       | on        |
| ZFS | cold  | 32      | 16      | 1569                    | 492                  | 40                 | 686341       | off       |
| ZFS | hot   | 32      | 16      | 4057                    | 1275                 | 104                | 686341       | off       |
| XFS | cold  | 32      | 16      | 1665                    | 924                  | 69                 | 445410       | off       |
| XFS | hot   | 32      | 16      | 11342                   | 6302                 | 471                | 445410       | off       |

FTR, "handbrake" means atime enabled on the filesystem.

Note: the XFS filesystem has cache=data enabled whereas the ZFS one only has primarycache=metadata; this might explain the big difference between the two in the hot cache test case.
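
For example, going from cold to hot cache multiplies the total object rate by about 6.8 on XFS (1665 → 11342 obj/s) but only by about 2.1 to 2.6 on ZFS (e.g. 1569 → 4057 obj/s), which is what one would expect if ZFS keeps only metadata in its cache and still hits the disks for data blocks.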