
Hardware architecture for the object storage
Status: Work in Progress · Priority: Normal · Public

Description

Given the benchmark results from T3149, what hardware architecture could support the object storage design? (See also T3054 for more context.)

Here is a high-level description of the minimal hardware setup.

Network

  • 10Gb 16-port switch

Write Storage

If the failure domain is the host, there must be two of each machine. If the failure domain is the disk, additional disks must be added for RAID5 or RAID6.

  • 1 Global Index: disks == 4TB nvme, nproc == 48, ram == 128GB, network == 10Gb
  • 1 Write ingestion: disks == 6TB nvme, nproc == 64, ram == 256GB, network == 10Gb

Once ingested in PostgreSQL, the global index uses 125 bytes per entry. Each entry holds a 32-byte cryptographic signature plus an 8-byte identifier of the shard in which the corresponding object can be found, and a unique index is created on the cryptographic signature.
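The per-entry arithmetic above can be sanity-checked with a short sketch. The object count used below is a hypothetical figure for illustration, not taken from this task:

```python
# Back-of-the-envelope check of the global index sizing.
SIGNATURE_BYTES = 32    # cryptographic signature per entry
SHARD_ID_BYTES = 8      # identifier of the shard holding the object
PAYLOAD_BYTES = SIGNATURE_BYTES + SHARD_ID_BYTES  # 40 bytes of payload
ROW_BYTES = 125         # observed size per entry once in PostgreSQL
                        # (row plus unique-index overhead)

def index_size_tb(entries: int) -> float:
    """Estimated on-disk size of the global index, in TB."""
    return entries * ROW_BYTES / 1e12

# Hypothetical count of 10 billion objects (illustrative assumption):
entries = 10_000_000_000
print(f"payload per entry: {PAYLOAD_BYTES} B, stored: {ROW_BYTES} B")
print(f"index size for {entries:,} entries: {index_size_tb(entries):.2f} TB")
```

At that scale the index would occupy roughly 1.25 TB, which comfortably fits the 4TB NVMe listed above.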

Read Storage

  • 3 monitor/orchestrator: disks == 500GB ssd + 4TB storage, nproc == 8, ram == 32GB, network == 10Gb
  • 7 osd: disks == 500GB ssd + 10 x 8TB/12TB, nproc == 16, ram == 128GB, network == 10Gb
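As a rough capacity check for the OSD fleet above, here is a sketch assuming the lower 8TB disk option and 3x replication (Ceph's default); neither assumption is stated in this task:

```python
# Raw vs. usable capacity of the Read Storage OSD hosts.
OSD_HOSTS = 7
DISKS_PER_HOST = 10
DISK_TB = 8        # the spec allows 8TB or 12TB drives (assumption: 8TB)
REPLICATION = 3    # assumed Ceph default replication factor

raw_tb = OSD_HOSTS * DISKS_PER_HOST * DISK_TB
usable_tb = raw_tb / REPLICATION
print(f"raw: {raw_tb} TB, usable at {REPLICATION}x replication: {usable_tb:.0f} TB")
```

With erasure coding instead of replication the usable fraction would be higher, at the cost of CPU and recovery time.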

Clients

Each client machine runs up to 20 daemons servicing client requests for the Read Storage and the Write Storage.

  • 2 daemons: disks == 500GB ssd + 4TB storage, nproc == 24, ram == 32GB, network == 10Gb

See also https://www.supermicro.com/en/solutions/red-hat-ceph

Event Timeline

dachary changed the task status from Open to Work in Progress. Sat, May 15, 1:07 PM
dachary created this task.
dachary created this object in space S1 Public.
dachary updated the task description.
dachary triaged this task as Normal priority. Mon, May 17, 1:59 PM
dachary updated the task description.

@olasd E. Lacour completed a study for a Ceph cluster today, with hardware specifications and pricing. He is available to discuss if you'd like.

This comment was removed by dachary.

E. Lacour @ easter-eggs recently finished a study for hardware procurement and the design of a Ceph cluster that is not too far from the minimum required for the Read Storage. He is available to help if needed and if possible.

Yeah, I think it would be useful to have a chat, at least to get a set of sensible ballpark figures against which we can measure our own quotes (and maybe get an idea of other providers we could get hardware from, if what we're getting isn't satisfactory).

Would you mind setting up a call with Emmanuel, @vsellier and myself this week? (Starting Tuesday, I don't have any hard scheduling constraints for what I'd expect would be a 30-minute call.)

The call is set to Wednesday June 2nd, 2021 4pm UTC+2 at https://meet.jit.si/ApparentStreetsJokeOk

My notes on the meeting:

manu

  • Remote access
  • ASINFO provided the necessary specs based on our requirements; it would have been too difficult for us to navigate the catalogue ourselves
  • We don't buy the hard drives from ASINFO because they are more expensive there
  • Careful with the choice of SSD and NVMe drives: it matters a lot, and the Ceph cluster wears them out very quickly (see Intel vs. others)
  • We added HBA cards with cache; otherwise it is slower
  • We have two pools for Ceph
    • Journal on SSD + HDD
    • Full SSD
  • It is difficult to find 2.5'' SSDs; NVMe drives are easier to find
  • The Ceph backend and frontend networks are worth separating for easier debugging (two 10Gb cards)
  • We tried a hyperconverged setup (VMs + Ceph) but had trouble debugging performance problems

olasd

  • For the PostgreSQL cluster
  • Dell with 5 years warranty
  • Two machines with cross replication
  • Ceph
  • Machines without warranty
  • Not necessarily Dell, maybe SuperMicro
  • ASINFO is our provider
  • We bought a disk array from ASINFO and they gave us a reasonable deal, but we did not research the market thoroughly
  • It is 2x cheaper than Dell
  • I noticed that NVMe can be in the same price range as SSD
  • Regarding the network, we have two switches that do 10Gb
  • I'm not sure if there is a need for link aggregation
  • We have Proxmox-based Ceph (hyperconverged)
  • For the Read Storage we are looking at a 100% dedicated Cluster
  • Maybe (at a later time) we could use the Ceph cluster for other workloads (if performance allows)
  • We will need to add a rack (3 PDU, 32A each)