
Resume provenance content-revision layer processing through the revision journal client
Closed, Migrated


We want to resume processing of revisions in the provenance database using the new journal client.

To do so, we will:

  • give mmca access to the main archive through an SSH reverse tunnel
  • give mmca access to the journal through read-only, non-privileged journal credentials
  • upgrade swh.provenance on mmca
  • restart mmca's content-revision layer using the journal client

Event Timeline

olasd triaged this task as High priority. Jul 22 2022, 11:17 AM
olasd created this task.

I've added an autossh-mmca.service unit on belvedere to have a reverse tunnel from localhost@mmca:5345 -> localhost@belvedere:5432.
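For reference, the kind of command such a unit might wrap can be sketched as below. This is only an illustration, not the contents of the actual autossh-mmca.service: the SSH destination name `mmca`, the keep-alive options, and the helper name `tunnel_cmd` are assumptions; only the two ports come from the description above. The sketch prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only, meant to be read as "what belvedere would run".
# ASSUMPTIONS: the SSH destination "mmca" and all -o options are illustrative;
# the real unit's command line is not shown in this task.
REMOTE_HOST="mmca"   # assumed SSH destination for mmca
REMOTE_PORT=5345     # port opened on mmca's loopback (from the task)
LOCAL_PORT=5432      # port on belvedere's loopback (from the task)

# Print the autossh invocation instead of executing it.
#   -M 0  disables autossh's own monitoring port (relies on SSH keep-alives)
#   -N    does not run a remote command
#   -R    asks the remote side (mmca) to forward its port back to belvedere
tunnel_cmd() {
  echo "autossh -M 0 -N -o ServerAliveInterval=30 -o ServerAliveCountMax=3 -o ExitOnForwardFailure=yes -R localhost:${REMOTE_PORT}:localhost:${LOCAL_PORT} ${REMOTE_HOST}"
}

tunnel_cmd
```

With a tunnel of this shape up, a client on mmca connecting to localhost:5345 would reach port 5432 on belvedere.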

I've granted mmca access to the production journal for user swh-provenance-mmca.

On mmca:

  • I've upgraded swh.provenance in a new virtualenv.
  • I've created systemd system units for the provenance revision journal clients on mmca:
    swhprovenance@mmca:~$ ls -l /etc/systemd/system/swh-provenance-revision-client*
    -rw-r--r-- 1 root root 701 Jul 22 13:48 /etc/systemd/system/swh-provenance-revision-client-debug@.service
    -rw-r--r-- 1 root root 671 Jul 22 13:50 /etc/systemd/system/swh-provenance-revision-client@.service

I've started the debug unit for one of the clients, and the non-debug unit for 63 others.
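Since both units are systemd templates (the `@.service` suffix), starting one debug instance and 63 regular ones could look roughly like the dry run below. The instance names are an assumption: the task does not say how the instances are named, so the numbering 0..63 and the helper name `start_cmds` are hypothetical. The script only prints the `systemctl` commands.

```shell
#!/bin/sh
# Dry run: print, rather than execute, the systemctl commands.
# ASSUMPTION: instances are numbered 0..63 and instance 0 is the debug one;
# the real instance names are not given in this task.
DEBUG_INSTANCE=0
LAST_INSTANCE=63

start_cmds() {
  i=0
  while [ "$i" -le "$LAST_INSTANCE" ]; do
    if [ "$i" -eq "$DEBUG_INSTANCE" ]; then
      unit="swh-provenance-revision-client-debug@${i}.service"
    else
      unit="swh-provenance-revision-client@${i}.service"
    fi
    echo "systemctl start ${unit}"
    i=$((i + 1))
  done
}

start_cmds
```

Once the instance names have been checked against the real setup, the printed lines could be reviewed and then run as root.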

To monitor the progress of the journal clients:

To monitor the provenance progress: is still valid.

The journal client has a bunch of already-processed revisions to skip over/throw away, so the actual provenance metrics will probably start picking back up in a few days.

FWIW I've also started the provenance storage server as a systemd system service now.