Closing this as the new release of aiohttp (3.6.3) mitigates the issue mentioned above and the CI build of swh-graph is now fixed.
Jan 8 2021
Jan 7 2021
Dec 17 2020
Nov 23 2020
Nov 12 2020
Nov 10 2020
Oct 13 2020
Oct 1 2020
Sep 30 2020
Sep 29 2020
I managed to track the dependency bump that broke the CI build: it is the upgrade of yarl (dependency of aiohttp) from 1.5.1 to 1.6.0. The issue is surely due to that commit.
Sep 26 2020
Sep 25 2020
Sep 22 2020
Sep 21 2020
Sep 19 2020
This was fixed a while ago by D2669.
Sep 18 2020
In T2589#49073, @anlambert wrote: You are right, they are not stored in the database, but there is a storage.origin_get_by_sha1 method.
However, unless I'm missing something, I think right now origin sha1s are not stored at all in swh-storage, or are they?
If they indeed aren't, a required sub-task of this one is adding sha1s to the origin table, together with an index to do the reverse sha1 -> url, and a matching swh-storage API method.
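The sub-task described above (store a sha1 next to each origin URL, index it, and expose a reverse lookup) can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the real database, the sha1 is assumed to be the SHA-1 of the UTF-8 encoded URL, and `make_origin_table`, `add_origin`, and `lookup_origin_url` are hypothetical names, not swh-storage API methods.

```python
import hashlib
import sqlite3
from typing import Optional


def url_sha1(url: str) -> bytes:
    """Assumed hashing scheme: SHA-1 of the UTF-8 encoded origin URL."""
    return hashlib.sha1(url.encode("utf-8")).digest()


def make_origin_table() -> sqlite3.Connection:
    """Create an in-memory table with a sha1 column and an index on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE origin (url TEXT PRIMARY KEY, sha1 BLOB NOT NULL)")
    # The index is what makes the reverse sha1 -> url lookup cheap.
    conn.execute("CREATE INDEX origin_sha1_idx ON origin (sha1)")
    return conn


def add_origin(conn: sqlite3.Connection, url: str) -> None:
    """Insert an origin together with its precomputed sha1."""
    conn.execute(
        "INSERT INTO origin (url, sha1) VALUES (?, ?)", (url, url_sha1(url))
    )


def lookup_origin_url(conn: sqlite3.Connection, sha1: bytes) -> Optional[str]:
    """Hypothetical reverse lookup: return the URL for a sha1, or None."""
    row = conn.execute("SELECT url FROM origin WHERE sha1 = ?", (sha1,)).fetchone()
    return row[0] if row else None
```

Storing the hash in its own column keeps the lookup a plain equality match on an indexed value; an index on an expression over the URL would be an alternative that avoids the extra column.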
Sep 17 2020
In T2589#48566, @anlambert wrote:
- We can process swh-graph responses to enrich the data (notably get origin URLs from their sha1 or turn swhids into dicts) and return them in JSON format
Sep 16 2020
No, only the edge part is done, we still need a parquet and a CSV exporter :/
It is already running on granet :-)
I think this is (reasonably) done now, please check and close it.
We now have a newer version of the compressed graph (2020-05-20), but it's not yet running on granet (I *think*, and, lacking T2579, I haven't checked).
Please make granet run that version and then close this task. (Or just close this task if it's already done.)
FTR, in a previous life, I set up JSON Web Token auth validation in Varnish.
Sep 15 2020
Must update the quickstart documentation guide once implemented.
Sep 14 2020
I agree with @olasd to do the reverse proxy at the webapp level. The main advantages are:
- We can use the same Web API authentication backend to manage authentication and user permissions. API authentication is based on an OIDC offline refresh token, and access token renewal is handled in the Django DRF authentication backend. While it should be possible to implement that process at the reverse proxy level, filtering users would not be as easy as using the fine-grained permissions from the Django User API.
- We can process swh-graph responses to enrich the data (notably get origin URLs from their sha1 or turn swhids into dicts) and return them in JSON format.
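As a rough illustration of the "turn swhids into dicts" part of the second point, here is a minimal sketch. It assumes identifiers of the form `swh:1:<type>:<40 hex digits>`; the function name `swhid_to_dict`, the dict keys, and the inclusion of an `ori` type for origins are assumptions made for this example, not the actual swh-web code.

```python
import json
import re
from typing import Dict

# Assumed shape: swh:1:<object type>:<40 lowercase hex digits>.
# "ori" is included on the assumption that swh-graph also returns origin nodes.
_SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp|ori):([0-9a-f]{40})$")


def swhid_to_dict(swhid: str) -> Dict[str, str]:
    """Split a SWHID string into a JSON-friendly dict (hypothetical keys)."""
    match = _SWHID_RE.match(swhid)
    if match is None:
        raise ValueError(f"not a valid SWHID: {swhid!r}")
    return {"swhid": swhid, "type": match.group(1), "hash": match.group(2)}


def enrich_response(lines: str) -> str:
    """Turn a newline-separated list of SWHIDs into a JSON array of dicts."""
    swhids = [line.strip() for line in lines.splitlines() if line.strip()]
    return json.dumps([swhid_to_dict(s) for s in swhids])
```

Doing this in the webapp rather than in the reverse proxy keeps the enrichment next to the code that already knows how to resolve identifiers.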
So, my first instinct for this was to implement the "mount" at the reverse proxy level (before even hitting swh-web), but:
Sep 9 2020
Sep 7 2020
Sep 6 2020
Noting down that I had a tentative, very preliminary implementation in the feature/fuse branch of swh-graph; see in particular fuse.py there.
It's probably not worth picking up and we should restart from scratch at this point, but it might still contain useful material.
(The webclient in there has since become a proper thing, see T2279. So that part is definitely obsolete.)
Sep 4 2020
closed by 8c937da20785699ae2a0a604104a9e458eced201
Aug 24 2020
Jul 23 2020
Just to be clear, the problem here wasn't directly linked to swhgraphshm but simply to the amount of available memory, because the MAP_PRIVATE flag tried to reserve all that memory to be able to perform copy-on-write. Using MAP_SHARED + PROT_READ avoids this memory reservation and fixes the issue. swhgraphshm was just a random process taking a lot of the available RAM, not specifically the reason why it failed.
The ZFS ARC (ZFS's page cache) is set to grow without bounds.
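To make the flag combination concrete, here is a minimal sketch of a read-only shared mapping using Python's standard `mmap` module on a Unix system. It only illustrates the MAP_SHARED + PROT_READ combination; it is not the swh-graph code, and `map_readonly_shared` is a hypothetical helper name.

```python
import mmap


def map_readonly_shared(path: str) -> mmap.mmap:
    """Map a whole (non-empty) file read-only with MAP_SHARED (Unix only)."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. The mapping stays valid after the
        # file object is closed. PROT_READ without PROT_WRITE means the
        # mapping can never be written to, so no copy-on-write is involved.
        return mmap.mmap(
            f.fileno(), 0, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ
        )
```

The returned object supports slicing like a `bytes` object, and attempts to assign to it raise an error because the mapping is read-only.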
Jul 15 2020
I think it's related to the shm trick.
Jun 3 2020
Feb 14 2020
Feb 13 2020
Feb 7 2020
Would this param replace /last altogether, since it would be equivalent to ?limit=1, or are they mutually exclusive?
Jan 29 2020
Jan 22 2020
Jan 8 2020
I just ran it on Azure. It has a different schema (the "revision" table is split into "revision" and "revision_parent"), so the benchmarks are not exactly comparable.
I still use 16 workers, all running on the same machine, and with no compression.
Dec 6 2019
Nov 30 2019
Nov 27 2019
proposed CLI interface:
swh [ -C config.yml ] graph mount PID DIR
will mount the content of the given PID to the given local DIR.
Nov 25 2019
Sure, it's as simple as you'd expect:
can you also post it here, please?
Nov 24 2019
It's up on granet, not committed yet because it should be included in a puppet integration diff.
Nov 22 2019
Launched on somerset:
We should consider just adding a btree index on sha1(url) and see where that takes us.
Forgot to mention it here, but it's done now.
Nov 19 2019
Nov 18 2019
That's now also been deployed.