
Up-to-date objstorage mirror on S3
Work in Progress, High priority, Public

Description

The S3 bucket containing objects is very out of date. We need to fill the gap and then keep it up to date. This will be done with the content replayer, which reads new object hashes from Kafka and copies the corresponding objects from the object storages on Banco and Uffizi.

To speed up the replay, it will use a 60GB file containing the hashes of objects that are already on S3. (It is a sorted list of fixed-size hashes, so the replayer can check membership with a few random short reads.)
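The lookup described above can be sketched as a binary search directly over the sorted hash file, so the 60GB inventory never has to be loaded into memory. This is a hypothetical re-implementation, not the actual swh code; the file layout (raw 20-byte SHA1 digests, concatenated in sorted order) is an assumption based on the description:

```python
import hashlib
import os
import tempfile

HASH_LEN = 20  # size in bytes of a raw SHA1 digest (assumption)

class SortedHashFile:
    """Membership test over a file of sorted, fixed-size raw hashes.
    Each lookup is a binary search doing O(log n) short random reads."""

    def __init__(self, path):
        self.f = open(path, "rb")
        self.n = os.path.getsize(path) // HASH_LEN

    def _hash_at(self, index):
        self.f.seek(index * HASH_LEN)
        return self.f.read(HASH_LEN)

    def __contains__(self, digest):
        lo, hi = 0, self.n
        while lo < hi:
            mid = (lo + hi) // 2
            h = self._hash_at(mid)
            if h < digest:
                lo = mid + 1
            elif h > digest:
                hi = mid
            else:
                return True
        return False

# Build a tiny inventory file and query it:
digests = sorted(hashlib.sha1(s).digest() for s in [b"a", b"b", b"c"])
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"".join(digests))
inventory = SortedHashFile(tmp.name)
print(hashlib.sha1(b"a").digest() in inventory)  # True
print(hashlib.sha1(b"z").digest() in inventory)  # False
```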

The content replayer is single-threaded and latency-bound, so we should run many instances of it, at least to fill the initial gap (I tried up to 100 on my desktop; the speedup is linear).

Example systemd unit to run it:

[Unit]
Description=Content replayer Rocq to S3 (service %i)
After=network.target

[Service]
Type=simple
ExecStart=/bin/bash -c 'sleep $(( RANDOM % 60 )); /home/dev/.local/bin/swh --log-level=INFO journal --config-file ~/replay_content_rocq_to_s3.yml content-replay --exclude-sha1-file /srv/softwareheritage/cassandra-test-0/scratch/sorted_inventory.bin'
Restart=on-failure
SyslogIdentifier=content-replayer-%i

Nice=10

[Install]
WantedBy=default.target

(The random sleep at the beginning works around a crash that happens when too many Kafka clients start at the same time.)

Example config file:

objstorage_src:
  cls: multiplexer
  args:
    objstorages:
    - cls: filtered
      args:
        storage_conf:
          cls: remote
          args:
            url: http://banco.internal.softwareheritage.org:5003/
        filters_conf:
        - type: readonly
    - cls: filtered
      args:
        storage_conf:
          cls: remote
          args:
            url: http://uffizi.internal.softwareheritage.org:5003/
        filters_conf:
        - type: readonly

objstorage_dst:
  cls: s3
  args:
    container_name: NAME_OF_THE_S3_BUCKET
    key: KEY_OF_THE_S3_USER
    secret: SECRET_OF_THE_S3_USER

journal:
  brokers:
  - esnode1.internal.softwareheritage.org
  - esnode2.internal.softwareheritage.org
  - esnode3.internal.softwareheritage.org
  group_id: vlorentz-test-replay-rocq-to-s3
  max_poll_records: 100
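The `objstorage_src` above is a multiplexer over two read-only remote backends, so conceptually a read falls through the backends in order until one has the object. A minimal sketch of that read path (hypothetical re-implementation; the real classes live in swh.objstorage, and `DictBackend` here is just a stand-in for a remote objstorage):

```python
class ReadOnlyMultiplexer:
    """Try each backend in order; return the first copy found.
    Backends are any objects exposing get(obj_id)."""

    def __init__(self, backends):
        self.backends = backends

    def get(self, obj_id):
        for backend in self.backends:
            try:
                return backend.get(obj_id)
            except KeyError:
                continue  # not in this backend, fall through to the next
        raise KeyError(obj_id)

class DictBackend:
    """Stand-in for a remote objstorage, backed by a dict."""

    def __init__(self, objects):
        self.objects = objects

    def get(self, obj_id):
        return self.objects[obj_id]

banco = DictBackend({"id1": b"content-1"})
uffizi = DictBackend({"id2": b"content-2"})
mux = ReadOnlyMultiplexer([banco, uffizi])
print(mux.get("id2"))  # b'content-2': found in the second backend
```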

Event Timeline

vlorentz triaged this task as High priority. Aug 19 2019, 11:44 AM
vlorentz created this task.
vlorentz updated the task description. Aug 19 2019, 11:47 AM
vlorentz updated the task description. Tue, Oct 29, 11:34 AM
olasd changed the task status from Open to Work in Progress. Fri, Nov 8, 11:41 AM

So I've deployed this (by hand for now) on uffizi and it seems to be doing its job.

Deployment steps:

  • create an IAM policy in the AWS management console with only read/write access to the softwareheritage/contents bucket
  • create an IAM account with that policy enabled
  • get access credentials for that IAM account
  • notice that the objstorage doesn't implement having contents in a subdirectory; fix that and release the objstorage
  • retrieve vlorentz's exclude file
  • tweak the config and the unit file (separate user, proper objstorage config with compression, ...)
  • systemctl start content-replayer-s3@{01..20}
  • notice that we process 2 objects per second per client
  • systemctl start content-replayer-s3@{21..40}
  • notice that we still process 2 objects per second per client and that the loadavg is < 15
  • systemctl start content-replayer-s3@{41..60}
  • notice that we still process 2 objects per second per client and that the loadavg is < 15
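Given the per-client rate observed in the steps above, a quick back-of-the-envelope estimate (assuming the scaling stays linear, which the steps suggest but don't guarantee):

```python
# Numbers observed above; linear scaling is an assumption.
clients = 60
rate_per_client = 2                      # objects per second per client
total_rate = clients * rate_per_client   # objects/s across all clients
per_day = total_rate * 86400             # seconds in a day
print(total_rate, per_day)               # 120 10368000
```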

I'll probably start more workers if things stay stable.

I've also patched swh.journal by hand to process batches of 1000 objects instead of 20 to reduce log spam. I'm not 100% sure how to handle that properly.
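The batching change amounts to grouping the consumed messages into larger chunks before processing and logging them. A generic sketch of that grouping (hypothetical; this is not the actual swh.journal patch):

```python
import itertools

def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch

# With size=1000 instead of 20, the replayer emits one log line per
# 1000 objects rather than one per 20, cutting log volume by 50x.
print(list(batched(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```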

olasd added a comment. Fri, Nov 8, 7:12 PM

I've added a metric with the S3 objects to https://grafana.softwareheritage.org/d/jScG7g6mk/objstorage-object-counts. There's... "some" work to do still.