zack (Stefano Zacchiroli)
User · Administrator

User Details

User Since
Sep 7 2015, 3:43 PM (205 w, 5 d)
Roles
Administrator

Recent Activity

Fri, Aug 16

zack accepted D1856: Benchmark: add median value in stats.
Fri, Aug 16, 6:21 PM
zack accepted D1855: Benchmark: output raw datapoints in CSV log file.
Fri, Aug 16, 6:06 PM
zack added inline comments to D1855: Benchmark: output raw datapoints in CSV log file.
Fri, Aug 16, 2:00 PM
zack requested changes to D1855: Benchmark: output raw datapoints in CSV log file.
Fri, Aug 16, 1:57 PM

Thu, Aug 15

zack accepted D1854: Use hash map instead of LongBigArray to backtrack.
Thu, Aug 15, 6:41 PM

Wed, Aug 14

zack accepted D1852: Add proper CLI args management in benchmarking tools.
Wed, Aug 14, 9:37 PM
zack added inline comments to D1852: Add proper CLI args management in benchmarking tools.
Wed, Aug 14, 9:36 PM
zack requested changes to D1852: Add proper CLI args management in benchmarking tools.
Wed, Aug 14, 9:25 PM
zack requested changes to D1852: Add proper CLI args management in benchmarking tools.
Wed, Aug 14, 4:59 PM
zack accepted D1851: Add provenance use-cases benchmarks.
Wed, Aug 14, 2:08 PM
zack accepted D1850: Add input wrapper class for endpoints methods.
Wed, Aug 14, 2:02 PM
zack accepted D1849: Add vault use-case benchmark.

I'm approving this as it's good enough, but please make the seed a real random seed rather than a hard-coded one (in a subsequent commit). A fixed seed is *useful* for reproducibility, but it should not be the default, and it should be possible to pass it externally, e.g., as a class parameter and/or CLI option.

Wed, Aug 14, 11:50 AM
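
The seed handling requested in the D1849 review above could look roughly like the following. This is a minimal sketch in Python, not the benchmark's actual (Java) code; the names `make_rng`, `parse_args`, and the `--seed` option are hypothetical and chosen only for illustration.

```python
import argparse
import random
import secrets


def make_rng(seed=None):
    """Return (rng, seed); a fresh random seed is drawn unless one is supplied."""
    if seed is None:
        # Default behaviour: a real random seed, so runs differ from each other.
        seed = secrets.randbits(64)
    return random.Random(seed), seed


def parse_args(argv=None):
    """Parse the (hypothetical) --seed option used to pin the seed externally."""
    parser = argparse.ArgumentParser(description="benchmark seed handling (sketch)")
    parser.add_argument("--seed", type=int, default=None,
                        help="fixed seed for reproducible runs (default: random)")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    rng, seed = make_rng(args.seed)
    # Report the seed actually used so that any run can be replayed later.
    print(f"seed={seed}")
    return [rng.randrange(100) for _ in range(5)]
```

Calling `main(["--seed", "42"])` twice yields the same sample both times, while `main([])` draws a new seed on each call and prints it so the run can be reproduced afterwards.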

Tue, Aug 13

zack accepted D1846: Add browsing use-cases benchmarks.
Tue, Aug 13, 2:48 PM
zack accepted D1843: Make the cypress build fail if any of the commands fails.
Tue, Aug 13, 2:08 PM
zack added inline comments to D1846: Add browsing use-cases benchmarks.
Tue, Aug 13, 10:50 AM
zack requested changes to D1846: Add browsing use-cases benchmarks.
Tue, Aug 13, 8:28 AM

Sat, Aug 10

zack renamed T1950: Reduce RAM usage for generating mapping files from Implement mapping files dumping with less RAM usage to Reduce RAM usage for generating mapping files.
Sat, Aug 10, 3:45 PM · Graph service

Fri, Aug 9

zack accepted D1832: Endpoints now return timings instead of logging them.
Fri, Aug 9, 10:03 AM

Thu, Aug 8

zack removed a member for Staff: haltode.
Thu, Aug 8, 4:50 PM
zack added a comment to D1832: Endpoints now return timings instead of logging them.
  • we should wrap the JSON return type more properly. Please use "result" instead of "content", everything else (like "timings" now, possibly other stuff in the future) will be metadata about the result

True, I will also rename "timings" to "metadata" because I don't like having both "Timing" and "Timings" as class names.

Thu, Aug 8, 4:43 PM
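
The response shape discussed in this comment can be sketched as follows. Only the top-level keys "result" and "metadata" come from the discussion; the nested "timings" entry, the `traversal_seconds` field, and both function names are assumptions made for illustration, and the sketch is in Python rather than the service's Java.

```python
import json
import time


def wrap_response(result, timings):
    """Wrap an endpoint result together with metadata about how it was produced."""
    return {
        "result": result,
        # Everything that describes the result, rather than being the result.
        "metadata": {"timings": timings},
    }


def timed_call(func, *args):
    """Run func(*args) and return its result wrapped with the elapsed time."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return wrap_response(result, {"traversal_seconds": elapsed})


def to_json(response):
    """Serialize the wrapped response as it would be sent to a client."""
    return json.dumps(response)
```

With this layout a client reads the payload from "result" and can ignore "metadata" entirely, which leaves room for adding further metadata later without changing the result's shape.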
zack requested changes to D1832: Endpoints now return timings instead of logging them.

I hadn't planned to lift the timings up to the REST layer, but why not: it will enable doing interesting stuff using non-Java clients. However, a couple of change requests:

Thu, Aug 8, 4:26 PM
zack renamed T1944: use a compact, binary format for node ids mapping files from More compact format for node ids mapping files to use a compact, binary format for node ids mapping files.
Thu, Aug 8, 1:00 PM · Graph service
zack added a comment to T1234: Allow simple read-only connections to db from swh nodes.

In the end the PGPASSFILE stuff cannot work with one globally defined file,
because for some reason the authors decided that the file must be readable only by its owner...

Thu, Aug 8, 12:59 PM · System administration
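
The constraint mentioned above comes from libpq, which on Unix ignores a password file that is accessible to group or others. A per-user file could be prepared along these lines; this is a sketch under stated assumptions, and the function names as well as any host, database, and user values passed in are hypothetical.

```python
import os
import stat


def write_pgpass(path, entries):
    """Write a libpq password file readable and writable by its owner only."""
    # Create the file with mode 0600 so it is never group/world readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        for host, port, database, user, password in entries:
            # One entry per line: hostname:port:database:username:password
            f.write(f"{host}:{port}:{database}:{user}:{password}\n")
    # Enforce the mode even if the file already existed with looser permissions.
    os.chmod(path, 0o600)
    return path


def is_private(path):
    """Return True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

Each user would then point the `PGPASSFILE` environment variable at their own copy, since a single shared file cannot satisfy the owner-only permission check for several users at once.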

Wed, Aug 7

zack updated subscribers of T1942: Create shorter url redirecting to "Save code now".

Just wanted to mention here that @douardda and @olasd discussed getting a short-URL-style domain (I think it was swh.li, but I'm not entirely sure) for various shortening needs.

Wed, Aug 7, 2:17 PM · Website
zack added a comment to T1234: Allow simple read-only connections to db from swh nodes.

This is great! Thanks a lot.

Wed, Aug 7, 9:05 AM · System administration

Tue, Aug 6

zack accepted D1822: Add detailed internal explanations in javadoc.
Tue, Aug 6, 6:35 PM
zack accepted D1821: Add javadoc generation in swh-graph docs assets.
Tue, Aug 6, 10:41 AM
zack requested changes to D1821: Add javadoc generation in swh-graph docs assets.

Can you also add a clean-javadoc target, equivalent to clean-images?

Tue, Aug 6, 10:08 AM

Mon, Aug 5

zack accepted D1817: Add contextual info to compression script.
Mon, Aug 5, 3:36 PM
zack requested changes to D1817: Add contextual info to compression script.
  • make the echo invocations a function, e.g., info()
  • use a common but distinguishable prefix instead of just "#", which makes it possible to search for progress info in the potentially huge log files, e.g., "* swh-graph:"
  • maybe add a step number vs. total number of steps (e.g., 2/5)
Mon, Aug 5, 3:26 PM
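
The three requests above (a dedicated info() function, a searchable prefix, and a step counter) can be combined as in the sketch below. The script under review is a shell script; Python is used here purely for illustration, and `make_info` and the step descriptions in the usage note are hypothetical.

```python
import io
import sys

# Common, distinguishable prefix so progress lines are easy to find in large logs.
PREFIX = "* swh-graph:"


def make_info(total_steps, stream=sys.stderr):
    """Return an info() function that numbers each progress message."""
    step = 0

    def info(message):
        nonlocal step
        step += 1
        line = f"{PREFIX} [{step}/{total_steps}] {message}"
        print(line, file=stream)
        return line

    return info


def demo():
    """Emit two progress lines into a buffer and return them."""
    buf = io.StringIO()
    info = make_info(5, stream=buf)
    info("first step")
    info("second step")
    return buf.getvalue().splitlines()
```

Searching a log for the literal string "* swh-graph:" then returns only the progress lines, each carrying its position in the pipeline.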
zack added a project to T1943: Publish swh-graph to PyPI: Graph service.

Sure, let's do that, but I've never touched anything related to SWH on PyPI, so I have no idea how to make it happen.
Whoever has an idea of how to do that, please just go ahead. (And feel free to tag any recent version for PyPI publishing; the currently tagged version was completely arbitrary, just because a tag was needed for $something.)

Mon, Aug 5, 1:07 PM · Graph service
zack committed rDDOCf46f42f25943: sort requirements files for ease of inspection (authored by zack).
Mon, Aug 5, 1:04 PM
zack closed D1783: sort requirements files for ease of inspection.
Mon, Aug 5, 1:04 PM

Sat, Aug 3

zack added a comment to T1884: python bindings for compressed graph access.

No objection from me on adding a more abstract node type. It would be a nicer API and, given it's gonna be on the python side only, it won't have any impact on perf.

Sat, Aug 3, 8:30 AM · Graph service

Thu, Aug 1

zack accepted D1793: docs: fix formatting issues.
Thu, Aug 1, 9:52 AM

Wed, Jul 31

Herald added a reviewer for D1783: sort requirements files for ease of inspection: Reviewers.
Wed, Jul 31, 11:24 AM
zack committed rDDOC0bf985b16688: requirements: add swh-graph to dev reqs too (authored by zack).
Wed, Jul 31, 11:21 AM
zack committed rDDOCfc58f8b0d597: requirements: add swh-graph (authored by zack).
Wed, Jul 31, 11:20 AM
zack committed rDDOC1976b384a6b2: remove archiver (now gone) from index and glossary (authored by zack).
Wed, Jul 31, 11:19 AM
zack committed rDGRPH83c6bdb8d93c: docs/index.rst: write intro and link existing/dangling documents (authored by zack).
Wed, Jul 31, 10:57 AM
zack committed rDDOCf04c6a88827e: link swh-graph from the doc index (authored by zack).
Wed, Jul 31, 10:25 AM

Tue, Jul 30

zack committed rDGRPHc883919c931d: dockerfile: add back /usr/bin/time invocation for resource monitoring (authored by zack).
Tue, Jul 30, 10:52 PM
zack added a comment to D1768: docker setup: simplify and uniform to SWH path conventions.

The mapping has always been a separate command but we can indeed add it to the generate_graph.sh script.

Tue, Jul 30, 5:57 PM
zack closed D1768: docker setup: simplify and uniform to SWH path conventions.

landed in a043b0ee04aae50f8c26f6a06aac1e6c9247340a

Tue, Jul 30, 5:56 PM
zack committed rDGRPH0d22852c1f4e: docker doc: drop the list of files generated by Setup class (authored by zack).
Tue, Jul 30, 5:54 PM
zack committed rDGRPH325ff999deb6: docker doc: add --publish to the run invocation (authored by zack).
Tue, Jul 30, 5:54 PM
zack committed rDGRPH6212957e8f14: docker doc: further shorten CLI examples using relative paths (authored by zack).
Tue, Jul 30, 5:54 PM
zack committed rDGRPHa043b0ee04aa: tests: update generate_graph.sh to match new docker layout (authored by zack).
Tue, Jul 30, 5:54 PM
zack committed rDGRPHa428cd8c341f: docker setup: simplify and uniform to SWH path conventions (authored by zack).
Tue, Jul 30, 5:54 PM
zack added a comment to D1768: docker setup: simplify and uniform to SWH path conventions.

Note that you also need to update the java/server/src/test/dataset/generate_graph.sh script since the Docker environment has changed.

Tue, Jul 30, 3:37 PM
zack updated the diff for D1768: docker setup: simplify and uniform to SWH path conventions.
  • tests: update generate_graph.sh to match new docker layout
Tue, Jul 30, 3:32 PM

Mon, Jul 29

zack added a comment to T1920: graph service: add tests for the python client.

We want to test that the client part of a complete client<->server interaction works properly. The best way to do that is, in fact, to rerun the same tests we run on the server side, but via the Python client. If there is an easy way to just reuse the same test code (e.g., by generating Python tests from the Java ones, or vice-versa), go for it. But it probably isn't worth it, as there isn't much test code anyway. If there is no way to keep parity, we should go for something minimal on the Python side, e.g., just test one call per API endpoint, and keep more complete coverage on the Java side (again: or vice-versa, if we prefer to maintain the Python test code base rather than the Java one).

Mon, Jul 29, 10:03 PM · Graph service
zack planned changes to D1768: docker setup: simplify and uniform to SWH path conventions.

Tnx, will do.

Mon, Jul 29, 4:58 PM
zack abandoned D1780: REST API doc: drop heading /graph.

No, you're right, I didn't think of the global namespace of unified documentation.

Mon, Jul 29, 4:57 PM
zack accepted D1782: Remove visit/ endpoint and keep only: visit/nodes visit/paths.
Mon, Jul 29, 4:55 PM
zack accepted D1781: Add 'origin' node type.
Mon, Jul 29, 12:44 PM
zack accepted D1753: Bypass edge restriction checks when edges=*.
Mon, Jul 29, 10:34 AM
zack accepted D1755: Add logging of endpoint timing.
Mon, Jul 29, 9:45 AM

Sun, Jul 28

zack triaged T1938: swh-graph: NullPointerException upon (wrong) /walk from cnt to snp as Normal priority.
Sun, Jul 28, 7:22 PM · Graph service
zack created D1780: REST API doc: drop heading /graph.
Sun, Jul 28, 7:11 PM
zack updated the diff for D1768: docker setup: simplify and uniform to SWH path conventions.
  • docker doc: add --publish to the run invocation
Sun, Jul 28, 7:05 PM
zack triaged T1937: nicer landing page for the swh-graph REST API as Low priority.
Sun, Jul 28, 7:04 PM · Graph service
zack triaged T1936: integrate swh-graph into the docker environment as Wishlist priority.
Sun, Jul 28, 6:56 PM · Docker environment, Graph service
zack updated the diff for D1768: docker setup: simplify and uniform to SWH path conventions.
  • docker doc: drop the list of files generated by Setup class
Sun, Jul 28, 5:37 PM

Thu, Jul 25

zack updated the diff for D1768: docker setup: simplify and uniform to SWH path conventions.
  • docker doc: further shorten CLI examples using relative paths
Thu, Jul 25, 5:02 PM
zack added a parent task for T1915: Add support for origin nodes in graph service API: T1867: compress Merkle DAG and origin nodes together.
Thu, Jul 25, 4:58 PM · Graph service
zack added a subtask for T1867: compress Merkle DAG and origin nodes together: T1915: Add support for origin nodes in graph service API.
Thu, Jul 25, 4:58 PM · Graph service
zack raised the priority of T1915: Add support for origin nodes in graph service API from Normal to High.

Due to this bug (I suppose), trying to generate the various mapping files for a compressed graph that also includes swh:1:ori:... PIDs fails with:

Pre-computing node id maps...
Exception in thread "main" java.lang.IllegalArgumentException: Unknown SWH ID type in: swh:1:ori:4135fe80baeff9983f73e94b02da92f618cbb6c7
	at org.softwareheritage.graph.SwhId.<init>(SwhId.java:44)
	at org.softwareheritage.graph.backend.Setup.precomputeNodeIdMap(Setup.java:108)
	at org.softwareheritage.graph.backend.Setup.main(Setup.java:47)
Thu, Jul 25, 4:57 PM · Graph service
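
The failure above arises because the "ori" type is not recognized when the identifier is parsed. A parser that accepts origin identifiers alongside the Merkle DAG node types could look like the sketch below. It is written in Python for illustration and is not the code of SwhId.java; the names `KNOWN_TYPES`, `PID_RE`, and `parse_pid` are hypothetical.

```python
import re

# Merkle DAG node types, plus "ori" for origin nodes.
KNOWN_TYPES = ("cnt", "dir", "ori", "rel", "rev", "snp")

# swh:1:<3-letter type>:<40 lowercase hex digits>
PID_RE = re.compile(r"swh:1:(?P<type>[a-z]{3}):(?P<hash>[0-9a-f]{40})")


def parse_pid(pid):
    """Split a version-1 SWH PID into (type, hash), validating both parts."""
    match = PID_RE.fullmatch(pid)
    if match is None:
        raise ValueError(f"Malformed SWH ID: {pid}")
    node_type = match.group("type")
    if node_type not in KNOWN_TYPES:
        raise ValueError(f"Unknown SWH ID type in: {pid}")
    return node_type, match.group("hash")
```

With "ori" in the accepted set, the identifier from the stack trace parses into its type and hash instead of raising an error.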
zack triaged T1933: bad invocation of o.s.graph.backend.Setup in docker doc as Low priority.
Thu, Jul 25, 4:55 PM · Graph service

Wed, Jul 24

zack edited reviewers for D1768: docker setup: simplify and uniform to SWH path conventions, added: seirl, haltode; removed: Reviewers.
Wed, Jul 24, 6:49 PM
Herald added a reviewer for D1768: docker setup: simplify and uniform to SWH path conventions: Reviewers.
Wed, Jul 24, 6:49 PM
zack updated the task description for T1930: swh-graph: ship swh-graph.jar in the docker container.
Wed, Jul 24, 7:54 AM · Graph service
zack committed rDGRPHed0a948477c4: docker doc: update names of mapping files (authored by zack).
docker doc: update names of mapping files
Wed, Jul 24, 7:54 AM
zack triaged T1930: swh-graph: ship swh-graph.jar in the docker container as Low priority.
Wed, Jul 24, 7:51 AM · Graph service

Sun, Jul 21

zack updated the task description for T1927: Web app: rate limiting based on per-client API tokens.
Sun, Jul 21, 4:11 PM · Web app
zack updated the task description for T1927: Web app: rate limiting based on per-client API tokens.
Sun, Jul 21, 4:10 PM · Web app
zack updated the task description for T1927: Web app: rate limiting based on per-client API tokens.
Sun, Jul 21, 4:10 PM · Web app
zack triaged T1927: Web app: rate limiting based on per-client API tokens as Normal priority.
Sun, Jul 21, 4:10 PM · Web app
zack updated subscribers of T1884: python bindings for compressed graph access.

Here's a first stab at an API for the py4j bindings that would be nice to use.

Sun, Jul 21, 2:31 PM · Graph service
zack triaged T1926: FUSE filesystem to navigate the archive as Wishlist priority.
Sun, Jul 21, 2:05 PM · Storage manager, Graph service

Fri, Jul 19

zack requested changes to D1753: Bypass edge restriction checks when edges=*.
Fri, Jul 19, 5:13 PM
zack requested changes to D1755: Add logging of endpoint timing.
Fri, Jul 19, 5:04 PM
zack added inline comments to D1755: Add logging of endpoint timing.
Fri, Jul 19, 4:20 PM
zack requested changes to D1755: Add logging of endpoint timing.

Instead of this, please add a pair of private methods to the Endpoint class, one to start timing (e.g., this.startTiming()) and one to end it and return the diff w.r.t. the start time (e.g., this.stopTiming()), and make all endpoint methods invoke the two methods and log the result.

Fri, Jul 19, 4:13 PM
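
The pair of timing methods requested in this review can be sketched as follows. The sketch is in Python rather than the Java of the actual Endpoint class; the `neighbors` method, its placeholder body, and the in-memory `log` list standing in for a real logger are all assumptions.

```python
import time


class Endpoint:
    def __init__(self):
        self._start = None
        self.log = []  # stand-in for a real logger

    def _start_timing(self):
        """Record the start time of the current endpoint call."""
        self._start = time.perf_counter()

    def _stop_timing(self):
        """Return the seconds elapsed since the matching _start_timing() call."""
        if self._start is None:
            raise RuntimeError("_stop_timing() called before _start_timing()")
        elapsed = time.perf_counter() - self._start
        self._start = None
        return elapsed

    def neighbors(self, node):
        """Hypothetical endpoint method: time the work, then log the duration."""
        self._start_timing()
        result = [node + 1, node + 2]  # placeholder for the real graph query
        elapsed = self._stop_timing()
        self.log.append(("neighbors", elapsed))
        return result
```

Every endpoint method follows the same start, work, stop, log pattern, so the timing logic lives in one place instead of being repeated inline.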
zack requested changes to D1753: Bypass edge restriction checks when edges=*.

I was thinking more of lifting this up to the traversal algo, ideally not having to do a test at every edge that is followed.
Done this way, this is not gonna gain you much, only a couple of array lookups per edge.

Fri, Jul 19, 3:42 PM
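
Lifting the edges=* decision up to the traversal, as suggested above, means choosing the successor function once before the loop rather than testing the restriction at every edge. Below is a minimal sketch in Python; the graph representation (plain dictionaries) and the name `visit` are assumptions and do not reflect the actual Java implementation.

```python
from collections import deque


def visit(graph, node_type, src, allowed_edges="*"):
    """Breadth-first visit from src.

    graph: dict mapping each node to its list of successors.
    node_type: dict mapping each node to its type (e.g. "rev", "dir", "cnt").
    allowed_edges: "*" for no restriction, or a set of (src_type, dst_type) pairs.
    """
    if allowed_edges == "*":
        # Decided once, before the traversal starts: no per-edge check at all.
        def successors(node):
            return graph[node]
    else:
        def successors(node):
            src_type = node_type[node]
            return [n for n in graph[node]
                    if (src_type, node_type[n]) in allowed_edges]

    seen = {src}
    queue = deque([src])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in successors(node):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return order
```

The traversal loop itself is identical in both cases; only the successor function differs, so the unrestricted case pays nothing for edge filtering.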
zack updated subscribers of T1921: swh-graph: add logging of endpoint timing.
Fri, Jul 19, 10:08 AM · Graph service
zack updated the task description for T1922: swh-graph optimization: bypass edge restriction checks when edges=*.
Fri, Jul 19, 10:08 AM · Graph service
zack triaged T1922: swh-graph optimization: bypass edge restriction checks when edges=* as High priority.
Fri, Jul 19, 10:06 AM · Graph service
zack updated the task description for T1885: benchmark swh-graph use cases on the full graph.
Fri, Jul 19, 10:06 AM · Graph service
zack triaged T1921: swh-graph: add logging of endpoint timing as High priority.
Fri, Jul 19, 10:05 AM · Graph service

Jul 18 2019

zack triaged T1920: graph service: add tests for the python client as Normal priority.
Jul 18 2019, 4:27 PM · Graph service
zack committed rDGRPHb60f3aca7402: add code of conduct (authored by zack).
Jul 18 2019, 4:22 PM
zack changed the status of T1851: Integrate graph-compression git repo in swh-environment from Open to Work in Progress.

in f13a43d697eb0d10ba59a4789847742607c49aaa I've added swh-graph to the mrconfig of swh-environment, let's see what the CI has to say about that… (cc: @douardda)

Jul 18 2019, 10:26 AM · Graph service
zack changed the status of T1851: Integrate graph-compression git repo in swh-environment, a subtask of T1887: publish swh-graph documentation at docs.s.o, from Open to Work in Progress.
Jul 18 2019, 10:26 AM · Development documentation, Graph service
zack committed rDENVf13a43d697eb: mrconfig: add shw-graph (authored by zack).
Jul 18 2019, 10:24 AM
zack accepted D1743: New structure for the git repo.
Jul 18 2019, 10:21 AM
zack updated the summary of D1743: New structure for the git repo.
Jul 18 2019, 10:21 AM

Jul 17 2019

zack accepted D1741: Add javadoc documentation in graph service.
Jul 17 2019, 3:23 PM
zack added a comment to D1741: Add javadoc documentation in graph service.

Concerning links to swh-graph/docs: they need to wait for the integration with swh-docs, after which external HTML links can be added.

Jul 17 2019, 3:21 PM
zack requested changes to D1741: Add javadoc documentation in graph service.
Jul 17 2019, 2:51 PM