docs: quickstart: add compression instructions
Closed, Public

Authored by haltode on Sep 15 2020, 10:41 AM.

Diff Detail

Repository
rDGRPH Compressed graph representation
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D3945 (id=13887)

Rebasing onto eaf0323a1c...

Current branch diff-target is up to date.
Changes applied before test
commit 19ebd60adf25a41b5842282cf07ae634c3a5ca95
Author: Thibault Allançon <haltode@gmail.com>
Date:   Tue Sep 15 10:37:25 2020 +0200

    docs: quickstart: add compression instructions

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/29/ for more details.

docs/quickstart.rst
4–5

I'd say using swh-graph (the API doesn't matter much).

28–31

Please merge this with the previous stuff, i.e., add zstd to the first paragraph and the zstd package name to the apt install line.

73

This is also documented in more detail in the swh-dataset package; see, e.g., https://docs.softwareheritage.org/devel/swh-dataset/graph/dataset.html, which also lists other datasets (small and large). You might want to sphinx-crosslink that package from here.

89–91

I begrudgingly ack this documentation change, because without it things would explode.

But.

At the same time, it would be better to find a sane default (e.g., computed as a proportion of the number of edges, if that is already known at this stage) that just works in most cases, rather than one that does not and has to come with recommended configuration tuning for users.

zack requested changes to this revision. Sep 15 2020, 11:02 AM

looks great in general!
just a few nits here and there (and possibly a separate issue to file for the sane default part)

This revision now requires changes to proceed. Sep 15 2020, 11:02 AM
This revision now requires review to proceed. Sep 15 2020, 11:03 AM
docs/quickstart.rst
73

Sure! However, the graphs compressed with swh-graph are not linked there (and the link seems to be missing for the popular-4k dataset). Should I add those to the swh-dataset doc page first?

89–91

Agreed, I opened a new task for this: T2595.

Fix wording suggestions from zack.

Build is green

Patch application report for D3945 (id=13894)

Rebasing onto eaf0323a1c...

Current branch diff-target is up to date.
Changes applied before test
commit 1f5199289a4ccef08583359de1e4449d954394d7
Author: Thibault Allançon <haltode@gmail.com>
Date:   Tue Sep 15 10:37:25 2020 +0200

    docs: quickstart: add compression instructions

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/30/ for more details.

This revision is now accepted and ready to land. Sep 15 2020, 1:22 PM

Build is green

Patch application report for D3945 (id=14051)

Rebasing onto a0f17b471c...

Current branch diff-target is up to date.
Changes applied before test
commit bc5614a2c6fad99eab7b3f4d19e813e1f0a297ba
Author: Thibault Allançon <haltode@gmail.com>
Date:   Tue Sep 15 10:37:25 2020 +0200

    docs: quickstart: add compression instructions

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/37/ for more details.