Finally, a more concentrated frequency dict:
Oct 6 2022
It should be more interesting to read with frequencies [1]:
From the paste [1] (CSV extract from the swh-scheduler dev db after 3 lister runs on
docker), here is the state of detected files [2] so far (computed with [3]):
Oct 5 2022
Oct 4 2022
Another data point about ^. It's not important for the guix manifest [1]. We can keep a
compatible behavior for it and slightly improve the listing behavior for nixpkgs as it's
important for those [2].
With the gazillion new diffs on top of the origin lister code, we can now also list
the nixpkgs-unstable-full.json manifests [1]
Another one bites the dust [1]
For the content loader, I mostly have checksum mismatches [1].
It seems the integrity in the manifest is either wrong, or some in-place update occurred on the respective servers [2].
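To make the mismatch check above concrete, here is a minimal sketch of comparing downloaded bytes against an integrity string. It assumes the manifest's integrity field uses the SRI-style `<algo>-<base64 digest>` form; the function names `make_sri` and `sri_matches` are hypothetical and are not taken from the loader code.

```python
import base64
import hashlib


def make_sri(data: bytes, algo: str = "sha256") -> str:
    """Build an SRI-style string ("<algo>-<base64 digest>") for some bytes."""
    digest = hashlib.new(algo, data).digest()
    return f"{algo}-{base64.b64encode(digest).decode('ascii')}"


def sri_matches(data: bytes, integrity: str) -> bool:
    """Return True if `data` hashes to the given SRI-style integrity string.

    Assumption: only sha256 and sha512 are handled in this sketch.
    """
    algo, _, expected_b64 = integrity.partition("-")
    if algo not in ("sha256", "sha512"):
        raise ValueError(f"unsupported algorithm: {algo!r}")
    digest = hashlib.new(algo, data).digest()
    return base64.b64encode(digest).decode("ascii") == expected_b64


if __name__ == "__main__":
    original = b"release tarball bytes"
    integrity = make_sri(original)
    # Same bytes: the check passes.
    print(sri_matches(original, integrity))
    # Bytes replaced in place upstream: the check fails.
    print(sri_matches(b"updated in place", integrity))
```

A failing check alone cannot tell a wrong manifest entry from an in-place update on the server; it only shows that the two disagree.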
Oct 3 2022
Run through docker for directory:
docker run on the lister:
17:36:23 swh-scheduler@localhost:5433=# select now(), visit_type, lister_id, count(*)
  from listed_origins
  where lister_id = (select id from listers
                     where name='nixguix' and instance_name='nix-community.github.io')
  group by visit_type, lister_id;
+-------------------------------+------------+--------------------------------------+-------+
| now                           | visit_type | lister_id                            | count |
+-------------------------------+------------+--------------------------------------+-------+
| 2022-10-03 15:44:20.179895+00 | git        | 3f5c040a-6247-4ef3-a812-36f4b9ceafeb |     1 |
| 2022-10-03 15:44:20.179895+00 | file       | 3f5c040a-6247-4ef3-a812-36f4b9ceafeb |    87 |
| 2022-10-03 15:44:20.179895+00 | tar        | 3f5c040a-6247-4ef3-a812-36f4b9ceafeb | 31130 |
+-------------------------------+------------+--------------------------------------+-------+
(3 rows)
Sep 30 2022
Sep 29 2022
Sep 23 2022
Hum, for the 7 false, I have to check. For the 88 packages with no-origin, it is more
annoying. Well, some are metapackages such as gcc-toolchain, so they can be skipped. Is it
ok for you to keep this 'no-origin' type? For some others, I have to check whether they are
covered elsewhere.
For ^, something like this would do [1]
Thanks for all that ^! And great pointers!
Sep 7 2022
Sep 6 2022
Some more information regarding extensions supported in nixpkgs and guix manifests:
In [33]: sources = "https://nix-community.github.io/nixpkgs-swh/sources-unstable.json"
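As a rough illustration of how an extension frequency could be computed, here is a minimal sketch. It assumes the artifact URLs have already been extracted from the manifest into a plain list; the helper names `url_extension` and `extension_frequency` and the example URLs are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse


def url_extension(url: str) -> str:
    """Return the lowercased extension of a URL's path, or '' if there is none.

    A '.tar' just before the last suffix is kept, so 'foo-1.2.3.tar.gz'
    yields '.tar.gz' rather than '.gz'.
    """
    suffixes = [s.lower() for s in PurePosixPath(urlparse(url).path).suffixes]
    if not suffixes:
        return ""
    if len(suffixes) >= 2 and suffixes[-2] == ".tar":
        return ".tar" + suffixes[-1]
    return suffixes[-1]


def extension_frequency(urls) -> Counter:
    """Count how often each extension appears in an iterable of URLs."""
    return Counter(url_extension(u) for u in urls)


if __name__ == "__main__":
    urls = [
        "https://example.org/pub/foo-1.2.3.tar.gz",
        "https://example.org/bar-0.1.tar.gz",
        "https://example.org/baz.ZIP",
        "https://example.org/download",
    ]
    for ext, count in extension_frequency(urls).most_common():
        print(repr(ext), count)
```

URLs with no recognizable extension are counted under the empty string, which keeps them visible in the frequency instead of silently dropping them.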
Aug 30 2022
Jul 1 2022
Jun 30 2022
Jun 29 2022
Jun 28 2022
So, looking a bit more into this possible new lister, we'd end up with the following
possible outputs:
- artifact URLs, which are mostly tarballs [1] and sometimes files [2]
- DVCS repositories, delegated to the dedicated loaders for ingestion: svn [3], hg [4], git [5] (out of the guix manifest)
- Other stuff can be ignored as we don't have anything relevant to ingest [6]
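The three outputs listed above could be sketched as a small classification function. This is only an illustration under assumptions: the entry shape (`{"type": ..., "url": ...}`), the suffix list, and the name `classify` are all hypothetical and do not reflect the actual manifest schema or lister code.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: an illustrative, non-exhaustive list of archive suffixes.
TARBALL_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz", ".zip")
DVCS_TYPES = {"svn", "hg", "git"}


def classify(entry: dict) -> Optional[str]:
    """Map a manifest entry to a visit type, or None when it can be ignored.

    Hypothetical entry shape: {"type": "url" | "git" | "hg" | "svn" | ..., "url": "..."}.
    """
    kind = entry.get("type")
    if kind in DVCS_TYPES:
        # Repositories are handed over to the matching dedicated loader.
        return kind
    if kind == "url":
        path = urlparse(entry.get("url", "")).path.lower()
        return "tar" if path.endswith(TARBALL_SUFFIXES) else "file"
    # Anything else has nothing relevant to ingest.
    return None


if __name__ == "__main__":
    entries = [
        {"type": "url", "url": "https://example.org/foo-1.0.tar.gz"},
        {"type": "url", "url": "https://example.org/fix.patch"},
        {"type": "git", "url": "https://example.org/repo.git"},
        {"type": "no-origin"},
    ]
    for entry in entries:
        print(entry.get("type"), "->", classify(entry))
```

Deciding tarball versus file from the URL suffix alone is a simplification; URLs without a telling extension would need another signal.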
May 25 2022
Another argument: currently, there are always at least some failures when loading real
Nix and Guix repositories, so visits always have status partial, which prevents them
from being listed in
https://archive.softwareheritage.org/browse/search/?q=&with_visit=true&with_content=true&visit_type=nixguix
(but we get results when un-checking "only show origins visited at least once")
Mar 25 2022
Mar 23 2022
Mar 16 2022
swh-model 5.0.0 released, which finalizes these changes