
swh-graph in production
Closed, Migrated (Edits Locked)

Description

We need a production swh-graph service, including fully automated periodic exports from storage.
This meta-task tracks all the related activities needed to achieve this goal.

Event Timeline

vlorentz triaged this task as Normal priority. (Jan 22 2020, 4:19 PM)
bchauvet raised the priority of this task from Normal to High. (Mar 25 2022, 5:29 PM)

Graph status meeting

  • Querying the graph
    • Rocquencourt: Granet, 700 GB of RAM (maximum reached)
  • Graph compression: 1.7 TB minimum
    • Telecom: 4 TB machine

Compression

TODO

  • Finish gRPC migration (seirl)
    • Forge issues to clean up once gRPC is merged (seirl)
  • Automate deployment (sysadm): prepare the command
  • Native Hadoop libraries (?)
    • T4250
    • Benchmark performance with and without the Hadoop libraries
  • Luigi ETL[1] pipeline for compression / deployment
  • Integration of the generated Javadoc in swh docs (vlorentz)
  • Integration of Java code coverage in the forge
  • Unit test the compression pipeline

[1] https://luigi.readthedocs.io/en/stable/
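
As a rough illustration of the pipeline item above, here is a minimal sketch of the stage ordering such a pipeline would have to enforce. In Luigi, each task declares its upstream tasks through `requires()`. The sketch uses only the standard library (`graphlib`) so that it runs without extra packages, which means it is not a Luigi pipeline itself. The stage names (`export`, `compress`, `deploy`) are assumptions taken from the wording of this task, and the stage bodies are placeholders, not swh-graph's actual commands.

```python
from graphlib import TopologicalSorter

# Hypothetical stages, named after the activities listed in this task.
# Each key maps to the set of stages that must finish before it starts.
DEPENDENCIES = {
    "export": set(),         # periodic export from storage
    "compress": {"export"},  # compression needs a finished export
    "deploy": {"compress"},  # deployment needs a compressed graph
}


def make_stage(name):
    """Build a placeholder stage; real work would replace the print."""
    def stage():
        print(f"running {name}")
    return stage


def run_pipeline(dependencies, stages):
    """Run every stage after all of its dependencies; return the order used."""
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        stages[name]()
        order.append(name)
    return order


STAGES = {name: make_stage(name) for name in DEPENDENCIES}
order = run_pipeline(DEPENDENCIES, STAGES)
```

With a real Luigi pipeline, the same ordering would come from each task's `requires()`, and each task's `output()` would let a periodic run skip stages whose results already exist.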