Sure, let's do that, but I've never touched anything related to SWH on PyPI, so I've no idea how to make it happen.
Whoever has an idea of how to do that, please just go ahead. (And feel free to tag any recent version for PyPI publishing; the currently tagged version was completely arbitrary, just because a tag was needed for $something.)
Aug 5 2019
Due to several rounds of server maintenance, the process was restarted a few times, but it has now finished and the results are uploaded to the annex: https://annex.softwareheritage.org/public/dataset/graph/latest/compressed/all+ori/
Aug 3 2019
What was tried so far:
No objection from me on adding a more abstract node type. It would be a nicer API and, given it's gonna be on the python side only, it won't have any impact on perf.
Aug 2 2019
Sounds good. No 'node' type then, we just use IDs? Maybe a Node type would let us do things like neighbors() directly on the Node instance?
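To make the Node idea concrete, here is a minimal sketch of what such a wrapper could look like on the Python side. Everything in it is hypothetical: the Graph class is a toy in-memory stand-in for the real backend, and the names Graph, Node, and neighbors() are only illustrative, not the actual swh-graph API.

```python
from typing import Dict, Iterator, List


class Graph:
    """Toy in-memory stand-in for the graph backend (hypothetical)."""

    def __init__(self, edges: Dict[str, List[str]]) -> None:
        # Adjacency lists keyed by node ID.
        self._edges = edges

    def neighbors(self, node_id: str) -> Iterator["Node"]:
        # ID-based API: yields the successors of node_id wrapped as Nodes.
        for dst in self._edges.get(node_id, []):
            yield Node(self, dst)

    def __getitem__(self, node_id: str) -> "Node":
        return Node(self, node_id)


class Node:
    """Thin wrapper that remembers its graph, so calls need no ID argument."""

    def __init__(self, graph: Graph, node_id: str) -> None:
        self.graph = graph
        self.id = node_id

    def neighbors(self) -> Iterator["Node"]:
        # Delegates to the ID-based API of the owning graph.
        return self.graph.neighbors(self.id)

    def __repr__(self) -> str:
        return f"<Node {self.id}>"
```

With this, `graph["some-id"].neighbors()` works directly on the instance, while the ID-based call on the graph stays available for code that prefers plain IDs.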
Aug 1 2019
Jul 31 2019
Jul 30 2019
One test call per endpoint seems enough right now: the Python side relies entirely on the swh custom REST API class for client <-> server interaction (which has tests of its own).
Jul 29 2019
We want to test that the client part of a complete client<->server interaction works properly. The best way to do that is, in fact, to rerun the same tests we run on the server side, but via the Python client. If there is an easy way to just reuse the same test code (e.g., by generating Python tests from the Java ones, or vice-versa), go for it. But it probably isn't worth it, as there isn't much test code anyway. If there is no way to keep parity, we should go for something minimal on the Python side, e.g., just test one call per API endpoint, and keep more complete coverage on the Java side (again: or vice-versa, if we prefer to maintain the Python test code base rather than the Java one).
Running swh-graph (with only the default graph, not its transpose) requires ~125GB of RAM.
What kind of tests do we want for the client-side code? Checking the resulting JSON format for each endpoint?
Yes, this is related: there are safety checks when creating a new SwhId from its string form, and right now the code for the type looks like:
Jul 28 2019
Jul 25 2019
Due to this bug (I suppose), trying to generate the various mapping files for a compressed graph that also includes swh:1:ori:... PIDs fails with:
Pre-computing node id maps...
Exception in thread "main" java.lang.IllegalArgumentException: Unknown SWH ID type in: swh:1:ori:4135fe80baeff9983f73e94b02da92f618cbb6c7
    at org.softwareheritage.graph.SwhId.<init>(SwhId.java:44)
    at org.softwareheritage.graph.backend.Setup.precomputeNodeIdMap(Setup.java:108)
    at org.softwareheritage.graph.backend.Setup.main(Setup.java:47)
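For illustration, here is a minimal Python sketch of the kind of check that produces this error, and of how accepting the ori type would avoid it. This is not the actual Java code in SwhId.java: the function name parse_pid, the KNOWN_TYPES set, and the regular expression are all assumptions made for the example.

```python
import re
from typing import Tuple

# Assumption: the accepted node types, with "ori" included so that
# origin PIDs are no longer rejected.
KNOWN_TYPES = {"cnt", "dir", "rev", "rel", "snp", "ori"}

# swh:1:<3-letter type>:<40 lowercase hex digits>
PID_RE = re.compile(r"swh:1:([a-z]{3}):([0-9a-f]{40})")


def parse_pid(pid: str) -> Tuple[str, str]:
    """Split a PID string into (type, hash), validating both parts."""
    match = PID_RE.fullmatch(pid)
    if match is None:
        raise ValueError(f"Malformed SWH ID: {pid}")
    node_type, digest = match.groups()
    if node_type not in KNOWN_TYPES:
        # Mirrors the message seen in the stack trace above.
        raise ValueError(f"Unknown SWH ID type in: {pid}")
    return node_type, digest
```

Removing "ori" from KNOWN_TYPES in this sketch reproduces the same class of failure for the PID shown in the trace.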
Jul 24 2019
Jul 21 2019
Here's a first stab at an API for the py4j bindings that would be nice to use.
Jul 19 2019
Jul 18 2019
In f13a43d697eb0d10ba59a4789847742607c49aaa I've added swh-graph to the mrconfig of swh-environment; let's see what the CI has to say about that… (cc: @douardda)
Jul 16 2019
*SAD TROMBONE*.
From the sphinx-maven plugin documentation:
Jul 15 2019
Jul 14 2019
Jul 12 2019
Jul 11 2019
Closed by 946d235ebdac.
More work needs to be done after D1700 (integrate with node->type map, better isolation for swh id/longs, etc.), so I'm leaving this open.
Jul 10 2019
Jul 9 2019
This was started on sexus yesterday; ETA: next Monday-ish.
Done in D1699.
Done in D1698.