
Add docker environment documentation
Closed, Public

Authored by haltode on Jul 4 2019, 3:37 PM.

Diff Detail

Repository
rDGRPH Compressed graph representation
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

zack requested changes to this revision. Jul 4 2019, 4:14 PM
zack added inline comments.
docs/docker.rst
24–25

I find this sentence confusing: it looks like it's expecting the dir to contain hidden files (starting with a .) for nodes and edges.

We should add a preliminary comment like, "given a graph specified as a pair of files called g.edges.csv.gz and g.nodes.csv.gz" (or whatever you like instead of g :-)).

While at it, we should describe the expected syntax and semantics of those two files. The syntax is a gzip-compressed CSV file, using space as separator and the following columns: bla bla bla. The semantics is partly a matter of pointing to SWH PIDs, plus the fact that the nodes file should contain a sorted list of all the (unique) node identifiers that occur in the edges file.
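
To make the nodes-file constraint concrete, here is a minimal Python sketch of a consistency check. It is not part of the revision, and it rests on assumptions: the column layout is not spelled out here, so the sketch assumes each edges line holds two space-separated identifiers (source, then destination), and it assumes "sorted" means plain byte-wise string order. The function names are hypothetical.

```python
import gzip


def expected_nodes(edges_path):
    """Return the sorted list of unique node identifiers used in an edges file.

    ASSUMPTION: each line is "<source> <destination>", separated by a space.
    """
    nodes = set()
    with gzip.open(edges_path, "rt", encoding="utf-8") as edges:
        for line in edges:
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            nodes.update(fields[:2])
    # ASSUMPTION: plain string ordering is the sort order the tools expect.
    return sorted(nodes)


def nodes_file_is_consistent(nodes_path, edges_path):
    """Check that the nodes file lists exactly the sorted unique edge endpoints."""
    with gzip.open(nodes_path, "rt", encoding="utf-8") as nodes:
        listed = [line.strip() for line in nodes if line.strip()]
    return listed == expected_nodes(edges_path)
```

A call such as `nodes_file_is_consistent("g.nodes.csv.gz", "g.edges.csv.gz")` would return `False` if the nodes file is unsorted, contains duplicates, or misses an identifier that appears in the edges file.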

27

active voice :-), i.e. "To start graph compression:"

32

I'm guessing graph_name here means that the inputs are graph_name.{nodes,edges}.csv.gz, right?

If so, specifying the base file name as mentioned above would clarify this part too.
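
If that guess is right, the naming convention could be illustrated roughly as follows. This is only a sketch of the guessed convention, not of the actual tooling; the helper name `input_files` and the directory argument are hypothetical.

```python
from pathlib import Path


def input_files(graph_dir, graph_name):
    """Map "nodes" and "edges" to the input paths implied by a base graph name.

    ASSUMPTION: inputs are named <graph_name>.nodes.csv.gz and
    <graph_name>.edges.csv.gz inside graph_dir.
    """
    base = Path(graph_dir)
    return {kind: base / f"{graph_name}.{kind}.csv.gz" for kind in ("nodes", "edges")}
```

For a base name `g`, this yields `g.nodes.csv.gz` and `g.edges.csv.gz` under the given directory.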

39

We should briefly explain what these mapping files are, and state what their file names will be.

This revision now requires changes to proceed. Jul 4 2019, 4:14 PM
  • Add preliminary section for expected input files
  • Add sub-sections
  • Fix style/grammar
This revision is now accepted and ready to land. Jul 5 2019, 2:38 PM
This revision was automatically updated to reflect the committed changes.