
Aug 26 2019

zack committed rDSNIP3e4f5ffa1c17: SQL graph export: remove ORDER BY to be more memory savvy (authored by zack).
SQL graph export: remove ORDER BY to be more memory savvy
Aug 26 2019, 10:34 AM
zack closed T1943: Publish swh-graph to PyPI as Resolved.
Aug 26 2019, 10:29 AM · Compressed graph service
zack closed T1887: publish swh-graph documentation at docs.s.o as Resolved.

now at https://docs.softwareheritage.org/devel/swh-graph/

Aug 26 2019, 10:28 AM · Documentation, Compressed graph service
zack closed T1904: build developer documentation for swh-graph as Resolved.
Aug 26 2019, 10:28 AM · Documentation, Compressed graph service
zack closed T1904: build developer documentation for swh-graph, a subtask of T1887: publish swh-graph documentation at docs.s.o, as Resolved.
Aug 26 2019, 10:28 AM · Documentation, Compressed graph service

Aug 25 2019

zack committed rDSNIP4e612e81696b: SQL snippet: add export of the archive as a graph (authored by zack).
SQL snippet: add export of the archive as a graph
Aug 25 2019, 6:20 PM
zack triaged T1968: existing graph endpoints should not return 404 upon missing arguments as Low priority.
Aug 25 2019, 2:50 PM · Easy hack, Compressed graph service

Aug 23 2019

zack raised the priority of T1967: REST server hangs when loading entire graph from High to Unbreak Now!.
Aug 23 2019, 8:13 PM · Compressed graph service
zack closed D1912: swh identify: add support for origin PIDs.
Aug 23 2019, 7:29 PM
zack committed rDMODfd2e6daef321: swh identify: add support for origin PIDs (authored by zack).
swh identify: add support for origin PIDs
Aug 23 2019, 7:29 PM
zack committed rDMOD880aff9d2d73: identifiers.py: add constants for 'swh:1' and sanitize namespace (authored by zack).
identifiers.py: add constants for 'swh:1' and sanitize namespace
Aug 23 2019, 7:29 PM
zack closed D1911: identifiers.py: add constants for 'swh:1' and sanitize namespace.
Aug 23 2019, 7:29 PM
zack added a comment to D1910: dev doc: source virtualenvwrapper.sh, as mkvirtualenv is gone from PATH.
In D1910#44117, @olasd wrote:

It's clear that the current instructions won't work when running all commands in succession (because virtualenvwrapper's config won't have been sourced unless the session has been restarted), so I guess this change is better than nothing.

Aug 23 2019, 7:17 PM
zack updated the summary of D1912: swh identify: add support for origin PIDs.
Aug 23 2019, 7:12 PM
zack added a comment to D1911: identifiers.py: add constants for 'swh:1' and sanitize namespace.

What is this for?

Aug 23 2019, 7:12 PM
zack accepted D1908: Release swh-graph 0.0.2.
Aug 23 2019, 7:09 PM
zack updated the summary of D1912: swh identify: add support for origin PIDs.
Aug 23 2019, 7:02 PM
Herald added a reviewer for D1912: swh identify: add support for origin PIDs: Reviewers.
Aug 23 2019, 7:02 PM
zack committed rDDOC3df1bd4419f9: dev doc: source virtualenvwrapper.sh, as mkvirtualenv is gone from PATH (authored by zack).
dev doc: source virtualenvwrapper.sh, as mkvirtualenv is gone from PATH
Aug 23 2019, 6:45 PM
zack closed D1910: dev doc: source virtualenvwrapper.sh, as mkvirtualenv is gone from PATH.
Aug 23 2019, 6:45 PM
Herald added a reviewer for D1911: identifiers.py: add constants for 'swh:1' and sanitize namespace: Reviewers.
Aug 23 2019, 6:35 PM
Herald added a reviewer for D1910: dev doc: source virtualenvwrapper.sh, as mkvirtualenv is gone from PATH: Reviewers.
Aug 23 2019, 4:19 PM
zack added a comment to D1908: Release swh-graph 0.0.2.

The "automatic" way is a javadoc macro that updates the doc version each time the file is modified, but I can also remove the version and since attributes.

Aug 23 2019, 2:32 PM
zack added a comment to T1964: Timeout reached while assembling the requested bundle (Rocrail).

@zack: Thanks for providing the tarball. However, what I in fact need is a tarred-up Git repo covering the history up to the moment the GPL was revoked, as I want to re-publish it via GitLab (or GitHub). Do you think it is possible to get a copy of that git repo any time soon?

Aug 23 2019, 2:29 PM · Vault
zack requested changes to D1908: Release swh-graph 0.0.2.

We certainly don't want to modify all Java files at each release.
So please either fix it the right way (which would probably mean some git subst var hackery) or at least for now just get rid of the version numbers from all Java source code - it's useless information anyway, as it belongs to the underlying VCS.

Aug 23 2019, 12:43 PM
zack triaged T1965: forge@softwareheritage.org: Recipient address rejected: User unknown in virtual mailbox table as High priority.
Aug 23 2019, 11:01 AM · System administration
zack added a comment to T1964: Timeout reached while assembling the requested bundle (Rocrail).

@sunweaver: did you see my answer above?

Aug 23 2019, 10:59 AM · Vault
zack added a comment to T1964: Timeout reached while assembling the requested bundle (Rocrail).

I can totally reproduce this issue.

Aug 23 2019, 9:23 AM · Vault
zack triaged T1964: Timeout reached while assembling the requested bundle (Rocrail) as Normal priority.
Aug 23 2019, 8:52 AM · Vault
zack accepted D1906: Fix tests using same Endpoint multiple times.
Aug 23 2019, 8:46 AM

Aug 22 2019

zack accepted D1900: server: benchmark: add nbEdgesAccessed in stats output.
Aug 22 2019, 7:24 PM

Aug 21 2019

zack accepted D1876: Fix indentation in docstrings..
Aug 21 2019, 11:57 AM
zack raised the priority of T593: ingest bitbucket hg/mercurial repositories from Normal to High.

Given the recent announcement by bitbucket about dropping mercurial support, the priority of this task has just increased.

Aug 21 2019, 10:11 AM · Archive coverage, Origin-Bitbucket

Aug 20 2019

zack accepted D1868: client: tests: add longer delay for server startup.
Aug 20 2019, 12:30 PM
zack updated the task description for T1957: Handling missing DAG nodes.
Aug 20 2019, 10:34 AM · Data Model

Aug 19 2019

zack added a comment to D1862: Allow -1 as Content length..

[ I'm jumping in here, but obviously I'm missing the IRL context. ]

Aug 19 2019, 5:53 PM
zack added inline comments to D1868: client: tests: add longer delay for server startup.
Aug 19 2019, 5:41 PM
zack accepted D1867: server: remove unnecessary 'visited.fill(false)' call.
Aug 19 2019, 5:37 PM
zack accepted D1866: client: tests add missing 'meta' information.

The "Build is green" review.

Aug 19 2019, 3:36 PM
zack accepted D1859: Benchmark: fix CSV log overwriting.

As discussed, please amend the commit before pushing to elaborate on why this is needed.

Aug 19 2019, 11:19 AM

Aug 18 2019

zack requested changes to D1857: Remove swh.graph from swh.docs' dependencies..

I disagree with this one. The right fix is making the doc build, not removing the dependency.

Aug 18 2019, 10:43 AM

Aug 16 2019

zack accepted D1856: Benchmark: add median value in stats.
Aug 16 2019, 6:21 PM
zack accepted D1855: Benchmark: output raw datapoints in CSV log file.
Aug 16 2019, 6:06 PM
zack added inline comments to D1855: Benchmark: output raw datapoints in CSV log file.
Aug 16 2019, 2:00 PM
zack requested changes to D1855: Benchmark: output raw datapoints in CSV log file.
Aug 16 2019, 1:57 PM

Aug 15 2019

zack accepted D1854: Use hash map instead of LongBigArray to backtrack.
Aug 15 2019, 6:41 PM

Aug 14 2019

zack accepted D1852: Add proper CLI args management in benchmarking tools.
Aug 14 2019, 9:37 PM
zack added inline comments to D1852: Add proper CLI args management in benchmarking tools.
Aug 14 2019, 9:36 PM
zack requested changes to D1852: Add proper CLI args management in benchmarking tools.
Aug 14 2019, 9:25 PM
zack requested changes to D1852: Add proper CLI args management in benchmarking tools.
Aug 14 2019, 4:59 PM
zack accepted D1851: Add provenance use-cases benchmarks.
Aug 14 2019, 2:08 PM
zack accepted D1850: Add input wrapper class for endpoints methods.
Aug 14 2019, 2:02 PM
zack accepted D1849: Add vault use-case benchmark.

I'm approving this as it's good enough. But please make the seed a real random seed rather than hard coded (in a subsequent commit). It's *useful* to have a fixed seed for reproducibility, but it should not be the default and it should be possible to pass it externally, e.g., as a class parameter and/or CLI option.
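
The seed handling requested above can be sketched as follows. This is a minimal Python illustration, not the benchmark's actual (Java) code: the `make_rng` and `parse_args` helpers and the `--seed` option name are assumptions made for the example. The seed is random by default, can be passed in explicitly, and is returned so a run can be logged and replayed.

```python
import argparse
import random
import secrets


def make_rng(seed=None):
    """Return (rng, seed). Draw a fresh random seed unless one is given."""
    if seed is None:
        # Default: a real random seed, so runs differ from each other.
        seed = secrets.randbits(64)
    return random.Random(seed), seed


def parse_args(argv=None):
    """Parse a hypothetical --seed option; absent means 'pick one at random'."""
    parser = argparse.ArgumentParser(description="benchmark sketch")
    parser.add_argument("--seed", type=int, default=None,
                        help="fixed seed, for reproducing an earlier run")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    rng, seed = make_rng(args.seed)
    # Always report the seed in use, so a random run can be replayed later.
    print(f"seed={seed}")
    return [rng.randrange(100) for _ in range(5)]
```

Calling `main(["--seed", "42"])` twice yields the same sample, while `main([])` draws a new seed on each call and prints it.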

Aug 14 2019, 11:50 AM

Aug 13 2019

zack accepted D1846: Add browsing use-cases benchmarks.
Aug 13 2019, 2:48 PM
zack accepted D1843: Make the cypress build fail if any of the commands fails..
Aug 13 2019, 2:08 PM
zack added inline comments to D1846: Add browsing use-cases benchmarks.
Aug 13 2019, 10:50 AM
zack requested changes to D1846: Add browsing use-cases benchmarks.
Aug 13 2019, 8:28 AM

Aug 10 2019

zack renamed T1950: Reduce RAM usage for generating mapping files from Implement mapping files dumping with less RAM usage to Reduce RAM usage for generating mapping files.
Aug 10 2019, 3:45 PM · Compressed graph service

Aug 9 2019

zack accepted D1832: Endpoints now return timings instead of logging them.
Aug 9 2019, 10:03 AM

Aug 8 2019

zack removed a member for Staff: haltode.
Aug 8 2019, 4:50 PM
zack added a comment to D1832: Endpoints now return timings instead of logging them.
  • we should wrap the JSON return type more properly. Please use "result" instead of "content", everything else (like "timings" now, possibly other stuff in the future) will be metadata about the result

True, I will also rename "timings" into "metadata" because I don't like having both "Timing" and "Timings" class name.
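
The envelope discussed in this exchange could look roughly like the sketch below. It is a Python illustration under assumptions: the exact key names (`metadata`, `timings`, `traversal`) and the `wrap_response` helper are hypothetical, and only the idea of a top-level "result" with everything else treated as metadata comes from the comments above.

```python
import json
import time


def wrap_response(compute):
    """Run `compute` and wrap its value in a result-plus-metadata envelope."""
    start = time.perf_counter()
    result = compute()
    elapsed = time.perf_counter() - start
    return {
        # The actual answer of the endpoint.
        "result": result,
        # Everything else is data about the result (key names are assumed).
        "metadata": {"timings": {"traversal": elapsed}},
    }


def to_json(payload):
    """Serialize the envelope as it would be sent to a non-Java client."""
    return json.dumps(payload)
```

Keeping the answer under a single key means clients can ignore metadata they do not understand, and new metadata fields can be added later without changing the shape of the result.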

Aug 8 2019, 4:43 PM
zack requested changes to D1832: Endpoints now return timings instead of logging them.

I didn't have in mind lifting the timings up to the REST layer, but why not: it will enable doing interesting stuff using non-Java clients. However, a couple of change requests:

Aug 8 2019, 4:26 PM
zack renamed T1944: use a compact, binary format for node ids mapping files from More compact format for node ids mapping files to use a compact, binary format for node ids mapping files.
Aug 8 2019, 1:00 PM · Compressed graph service
zack added a comment to T1234: Allow simple read-only connections to db from swh nodes.

In the end the PGPASSFILE stuff cannot work with one file globally defined.
Because for some reason, the authors decided that it needs to be only user readable...
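
For context, libpq ignores a password file on Unix if group or others have any access to it, which is why a single shared file cannot serve several users. A per-user file can be created and checked as in this minimal Python sketch; the helper names are hypothetical, and the path and credentials are left to the caller.

```python
import os
import stat


def write_private_passfile(path, line):
    """Write one pgpass entry (host:port:database:user:password) with mode 0600."""
    # Create the file owner-only from the start, then write the entry.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(line + "\n")
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, 0o600)


def is_private(path):
    """True if neither group nor others have any permission bit set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

Each user then points `PGPASSFILE` at their own copy, and `is_private` can be used to verify that the file will be accepted.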

Aug 8 2019, 12:59 PM · System administration

Aug 7 2019

zack updated subscribers of T1942: Create shorter url redirecting to "Save code now".

Just wanted to mention here that @douardda and @olasd discussed getting a short-URL-style domain (I think it was swh.li, but I'm not entirely sure) for various shortening needs.

Aug 7 2019, 2:17 PM · Website
zack added a comment to T1234: Allow simple read-only connections to db from swh nodes.

This is great! Thanks a lot.

Aug 7 2019, 9:05 AM · System administration

Aug 6 2019

zack accepted D1822: Add detailed internal explanations in javadoc.
Aug 6 2019, 6:35 PM
zack accepted D1821: Add javadoc generation in swh-graph docs assets.
Aug 6 2019, 10:41 AM
zack requested changes to D1821: Add javadoc generation in swh-graph docs assets.

Can you also add a clean-javadoc target, equivalent of clean-images ?

Aug 6 2019, 10:08 AM

Aug 5 2019

zack accepted D1817: Add contextual info to compression script.
Aug 5 2019, 3:36 PM
zack requested changes to D1817: Add contextual info to compression script.
  • make the echo invocations a function, e.g., info()
  • use a common but distinguishable prefix instead of just "#", so that progress info can be searched for in the potentially huge log files, e.g., "* swh-graph:"
  • maybe add a step number v. total number of steps (e.g., 2/5)
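
The three requests above can be sketched together as a single helper. The review concerns a shell script; this is the same idea in Python for illustration only, and the function signature and the bracketed counter format are assumptions, with only the `info()` name and the "* swh-graph:" prefix taken from the comment.

```python
import sys

# Distinguishable prefix, so progress lines can be grepped out of large logs.
PREFIX = "* swh-graph:"


def info(message, step=None, total=None, stream=None):
    """Print one progress line, optionally with a step counter such as 2/5."""
    if stream is None:
        stream = sys.stderr
    counter = ""
    if step is not None and total is not None:
        counter = f" [{step}/{total}]"
    line = f"{PREFIX}{counter} {message}"
    print(line, file=stream)
    return line
```

With this helper, `info("compressing graph", 2, 5)` emits `* swh-graph: [2/5] compressing graph`, and searching the log for the prefix recovers only the progress lines.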
Aug 5 2019, 3:26 PM
zack added a project to T1943: Publish swh-graph to PyPI: Compressed graph service.

Sure, let's do that, but I've never touched anything related to SWH on PyPI, so I've no idea how to make it happen.
Whoever has an idea of how to do that, please just go ahead. (And feel free to tag any recent version for PyPI publishing; the currently tagged version was completely arbitrary, just because a tag was needed for $something.)

Aug 5 2019, 1:07 PM · Compressed graph service
zack committed rDDOCf46f42f25943: sort requirements files for ease of inspection (authored by zack).
sort requirements files for ease of inspection
Aug 5 2019, 1:04 PM
zack closed D1783: sort requirements files for ease of inspection.
Aug 5 2019, 1:04 PM

Aug 3 2019

zack added a comment to T1884: python bindings for compressed graph access.

No objection from me on adding a more abstract node type. It would be a nicer API and, given it's gonna be on the python side only, it won't have any impact on perf.

Aug 3 2019, 8:30 AM · Compressed graph service

Aug 1 2019

zack accepted D1793: docs: fix formatting issues.
Aug 1 2019, 9:52 AM

Jul 31 2019

Herald added a reviewer for D1783: sort requirements files for ease of inspection: Reviewers.
Jul 31 2019, 11:24 AM
zack committed rDDOC0bf985b16688: requirements: add swh-graph to dev reqs too (authored by zack).
requirements: add swh-graph to dev reqs too
Jul 31 2019, 11:21 AM
zack committed rDDOCfc58f8b0d597: requirements: add swh-graph (authored by zack).
requirements: add swh-graph
Jul 31 2019, 11:20 AM
zack committed rDDOC1976b384a6b2: remove archiver (now gone) from index and glossary (authored by zack).
remove archiver (now gone) from index and glossary
Jul 31 2019, 11:19 AM
zack committed rDGRPH83c6bdb8d93c: docs/index.rst: write intro and link existing/dangling documents (authored by zack).
docs/index.rst: write intro and link existing/dangling documents
Jul 31 2019, 10:57 AM
zack committed rDDOCf04c6a88827e: link swh-graph from the doc index (authored by zack).
link swh-graph from the doc index
Jul 31 2019, 10:25 AM

Jul 30 2019

zack committed rDGRPHc883919c931d: dockerfile: add back /usr/bin/time invocation for resource monitoring (authored by zack).
dockerfile: add back /usr/bin/time invocation for resource monitoring
Jul 30 2019, 10:52 PM
zack added a comment to D1768: docker setup: simplify and uniform to SWH path conventions.

The mapping has always been a separate command but we can indeed add it to the generate_graph.sh script.

Jul 30 2019, 5:57 PM
zack closed D1768: docker setup: simplify and uniform to SWH path conventions.

landed in a043b0ee04aae50f8c26f6a06aac1e6c9247340a

Jul 30 2019, 5:56 PM
zack committed rDGRPH0d22852c1f4e: docker doc: drop the list of files generated by Setup class (authored by zack).
docker doc: drop the list of files generated by Setup class
Jul 30 2019, 5:54 PM
zack committed rDGRPH325ff999deb6: docker doc: add --publish to the run invocation (authored by zack).
docker doc: add --publish to the run invocation
Jul 30 2019, 5:54 PM
zack committed rDGRPH6212957e8f14: docker doc: further shorten CLI examples using relative paths (authored by zack).
docker doc: further shorten CLI examples using relative paths
Jul 30 2019, 5:54 PM
zack committed rDGRPHa043b0ee04aa: tests: update generate_graph.sh to match new docker layout (authored by zack).
tests: update generate_graph.sh to match new docker layout
Jul 30 2019, 5:54 PM
zack committed rDGRPHa428cd8c341f: docker setup: simplify and uniform to SWH path conventions (authored by zack).
docker setup: simplify and uniform to SWH path conventions
Jul 30 2019, 5:54 PM
zack added a comment to D1768: docker setup: simplify and uniform to SWH path conventions.

Note that you also need to update the java/server/src/test/dataset/generate_graph.sh script since the Docker environment has changed.

Jul 30 2019, 3:37 PM
zack updated the diff for D1768: docker setup: simplify and uniform to SWH path conventions.
  • tests: update generate_graph.sh to match new docker layout
Jul 30 2019, 3:32 PM

Jul 29 2019

zack added a comment to T1920: graph service: add tests for the python client.

We want to test that the client part of a complete client<->server interaction works properly. The best way to do that is, in fact, to rerun the same tests we run on the server side, but via the Python client. If there is an easy way to just reuse the same test code (e.g., by generating Python tests from the Java ones, or vice versa), go for it. But it probably isn't worth it, as there isn't much test code anyway. If there is no way to keep parity, we should go for something minimal on the Python side, e.g., just test one call per API endpoint, and keep more complete coverage on the Java side (again: or vice versa, if we prefer to maintain the Python test code base rather than the Java one).

Jul 29 2019, 10:03 PM · Compressed graph service
zack planned changes to D1768: docker setup: simplify and uniform to SWH path conventions.

Tnx, will do.

Jul 29 2019, 4:58 PM
zack abandoned D1780: REST API doc: drop heading /graph.

No, you're right, I didn't think of the global namespace of unified documentation.

Jul 29 2019, 4:57 PM
zack accepted D1782: Remove visit/ endpoint and keep only: visit/nodes visit/paths.
Jul 29 2019, 4:55 PM
zack accepted D1781: Add 'origin' node type.
Jul 29 2019, 12:44 PM
zack accepted D1753: Bypass edge restriction checks when edges=*.
Jul 29 2019, 10:34 AM
zack accepted D1755: Add logging of endpoint timing.
Jul 29 2019, 9:45 AM

Jul 28 2019

zack triaged T1938: swh-graph: NullPointerException upon (wrong) /walk from cnt to snp as Normal priority.
Jul 28 2019, 7:22 PM · Compressed graph service