
Benchmark objstorage for mirror (uffizi vs. azure vs. s3)
Closed, Migrated

Description

For the mirrors, we need some idea of how fast objects can be read from each objstorage.

Event Timeline

douardda created this task.

Current benchmark scenario:

  • build a list of 1M sha1s (extracted from the storage's content table)
  • retrieve these 1M objects from each objstorage from a nearby machine (an azure VM for azure, a rocq machine for uffizi, an ec2 machine for s3) and measure how long it takes
  • retrieve these 1M objects from each objstorage from a distant machine (if possible) and measure how long it takes

The script used to perform the benchmark is P820; a few probing runs (limited to 30s) with different numbers of workers and threads were used to find a sweet spot.
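
P820 is not reproduced here; for orientation, below is a minimal sketch of the measurement loop, covering only the threaded part (the real script also spreads the load over several worker processes) and assuming a hypothetical fetch() callable standing in for the actual objstorage client (RPC endpoint or local config):

import sys
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(hex_sha1: str) -> bytes:
    """Hypothetical object getter: the real script goes through the
    objstorage RPC API (e.g. http://uffizi:5003/) or a local objstorage
    built from a YAML config file."""
    raise NotImplementedError


def safe_fetch(hex_sha1: str):
    """Return the object content, or None for a missing object / error."""
    try:
        return fetch(hex_sha1)
    except Exception:
        return None


def bench(sha1_file: str, threads: int) -> None:
    """Fetch every object listed in sha1_file (one hex sha1 per line)
    with `threads` concurrent getters, then print object and volume rates."""
    with open(sha1_file) as f:
        sha1s = [line.strip() for line in f if line.strip()]

    ok = errors = nbytes = 0
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for content in pool.map(safe_fetch, sha1s):
            if content is None:
                errors += 1
            else:
                ok += 1
                nbytes += len(content)
    elapsed = time.monotonic() - start

    print(f"{len(sha1s) / elapsed:.0f} obj/s total, "
          f"{ok / elapsed:.0f} obj/s ok, "
          f"{nbytes / elapsed / 1e6:.0f} MB/s, "
          f"{errors} missing/errors")


if __name__ == "__main__":
    bench(sys.argv[1], threads=int(sys.argv[2]))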

Some results:

Notes

  • tests have been run while ingestion was paused, so there was no write load on uffizi
  • granet has been under serious load (swh-graph workload) during these tests
  • errors on uffizi are missing objects (on banco?)
| objstorage | from               | workers | threads | object rate | volume rate | errors |
|------------|--------------------|---------|---------|-------------|-------------|--------|
| azure      | boatbucket (azure) | 64      | 8       | 3201 obj/s  | 199 MB/s    | 0      |
| s3         | boatbucket (azure) | 64      | 16      | 1569 obj/s  | 97 MB/s     | 9      |
| s3         | granet (rocq)      | 64      | 16      | 2329 obj/s  | 144 MB/s    | 7      |
| uffizi     | granet (rocq)      | 16      | 8       | 1137 obj/s  | 60 MB/s     | 157521 |
| uffizi     | granet (rocq)      | 32      | 8       | 1929 obj/s  | 101 MB/s    | 157521 |
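
As a sanity check, the object and volume rates are consistent with an average object size of roughly 60 KB: 199 MB/s ÷ 3201 obj/s ≈ 62 KB for the azure run, and 144 MB/s ÷ 2329 obj/s ≈ 62 KB for the s3 run from granet.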

Command used:

  • uffizi: python bench_reqrep.py -w 16 -c 8 -f content_sha1 http://uffizi:5003/
  • azure: python bench_reqrep.py -l ERROR -f content_sha1 -w64 -c8 azure.yml
  • s3: python bench_reqrep.py -f content_sha1 -w64 -c8 s3.yml

content_sha1 is the result of SELECT sha1 FROM content ORDER BY sha256 LIMIT 1000000.

Since the results on uffizi above suffered from a few caveats, I've run a few more tests:

  • a first result was obtained with a dataset that only contained objects stored on the XFS part of the objstorage
  • a second dataset was then created (with the ORDER BY sha256 clause to spread the sha1s)
  • but its results are a mix of hot and cold cache tests

Made a new dataset using:

select sha1 from content where sha256 > '\x000729010ac682fa942e4bfedb2366da310ca438c1677ef0812dbb53c42bcea2' order by sha256 limit 100000 \g content_sha1_block2

The given sha256 is the last one of the first dataset.
This dataset is smaller (100k) just to get rough numbers for now; a 1M test case will follow.
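
For subsequent blocks, the same keyset pagination can be scripted; here is a rough sketch using psycopg2 (the dump_next_block helper and the DSN are hypothetical, the table and column names come from the queries above, and sha1/sha256 are assumed to be bytea columns):

import psycopg2


def dump_next_block(dsn: str, last_sha256_hex: str, limit: int, out_path: str) -> str:
    """Dump the next block of `limit` sha1s (one hex digest per line),
    paginating on sha256 so successive blocks never overlap.
    Returns the last sha256 of this block, to feed to the next call."""
    with psycopg2.connect(dsn) as db, db.cursor() as cur:
        cur.execute(
            """SELECT sha1, sha256 FROM content
               WHERE sha256 > %s
               ORDER BY sha256 LIMIT %s""",
            (bytes.fromhex(last_sha256_hex), limit),
        )
        rows = cur.fetchall()

    with open(out_path, "w") as out:
        for sha1, _sha256 in rows:
            out.write(bytes(sha1).hex() + "\n")

    return bytes(rows[-1][1]).hex()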

I've run this test on XFS and ZFS separately, using local objstorage configs instead of the RPC server running on uffizi (so these tests were executed on uffizi itself).

Here, cold means the sha1s have not been retrieved in a previous run (fresh dataset), whereas hot means the same test has been executed a second time immediately.

| FS  | cache | workers | threads | total obj. rate (obj/s) | ok obj. rate (obj/s) | volume rate (MB/s) | missing objs |
|-----|-------|---------|---------|-------------------------|----------------------|--------------------|--------------|
| XFS | cold  | 32      | 16      | 1571                    | 871                  | 72                 | 44584        |
| XFS | hot   | 32      | 16      | 11118                   | 6164                 | 513                | 44584        |
| ZFS | cold  | 32      | 16      | 1036                    | 327                  | 24                 | 68557        |
| ZFS | hot   | 32      | 16      | 10662                   | 3355                 | 254                | 68557        |

For the sake of completeness, the objstorage config files are:

---
objstorage:
  cls: pathslicing
  args:
    root: "/srv/softwareheritage/objects-xfs"
    slicing: 0:1/0:2/2:4/4:6
    compression: gzip
client_max_size: 1073741824

and

---
objstorage:
  cls: pathslicing
  args:
    root: "/srv/softwareheritage/objects"
    slicing: 0:2/0:5
    compression: none
client_max_size: 1073741824
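
For illustration, here is my reading of how these slicing specs map a hex sha1 to an on-disk path (each start:end component selects a slice of the hex digest as one directory level, and the object file is named after the full digest; the actual swh.objstorage pathslicing implementation may differ in details):

import os


def slice_path(root: str, hex_sha1: str, slicing: str) -> str:
    """Build the on-disk path of an object under a pathslicing objstorage,
    using character slices of the hex digest as directory levels."""
    parts = []
    for spec in slicing.split("/"):
        start, end = (int(x) if x else None for x in spec.split(":"))
        parts.append(hex_sha1[start:end])
    return os.path.join(root, *parts, hex_sha1)


h = "34973274ccef6ab4dfaaf86599792fa9c3fe4689"
print(slice_path("/srv/softwareheritage/objects-xfs", h, "0:1/0:2/2:4/4:6"))
# /srv/softwareheritage/objects-xfs/3/34/97/32/34973274ccef...
print(slice_path("/srv/softwareheritage/objects", h, "0:2/0:5"))
# /srv/softwareheritage/objects/34/34973/34973274ccef...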

Same as before but with 1M (fresh) sha1s:

| FS  | cache | workers | threads | total obj. rate (obj/s) | ok obj. rate (obj/s) | volume rate (MB/s) | missing objs | handbrake |
|-----|-------|---------|---------|-------------------------|----------------------|--------------------|--------------|-----------|
| ZFS | cold  | 32      | 16      | 1207                    | 379                  | 32                 | 686110       | on        |
| ZFS | hot   | 32      | 16      | 2491                    | 783                  | 66                 | 686110       | on        |
| ZFS | cold  | 32      | 16      | 1250                    | 393                  | 31                 | 685130       | on        |
| ZFS | hot   | 32      | 16      | 2613                    | 824                  | 65                 | 685130       | on        |
| ZFS | cold  | 32      | 16      | 1569                    | 492                  | 40                 | 686341       | off       |
| ZFS | hot   | 32      | 16      | 4057                    | 1275                 | 104                | 686341       | off       |
| XFS | cold  | 32      | 16      | 1665                    | 924                  | 69                 | 445410       | off       |
| XFS | hot   | 32      | 16      | 11342                   | 6302                 | 471                | 445410       | off       |

FTR, "handbrake" means atime enabled on the filesystem.

Note: the XFS filesystem has cache=data enabled whereas the ZFS one only has primarycache=metadata; this might explain the big difference between the two in the hot cache test case.
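
For example, going from cold to hot cache multiplies the total object rate by about 6.8 on XFS (1665 → 11342 obj/s) but only by about 2.1 to 2.6 on ZFS (e.g. 1569 → 4057 obj/s), which is what one would expect if ZFS keeps only metadata in its cache and still hits the disks for data blocks.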