
Dec 11 2020

zack committed rDDOCfc4f1b16010b: archive changelog: document GitLab(s), GNU, and NPM ingestion (authored by zack).
archive changelog: document GitLab(s), GNU, and NPM ingestion
Dec 11 2020, 3:06 PM
zack committed rDDOCbeadb19a3d4e: archive journal: simplify Sphinx markup for single-world links (authored by zack).
archive journal: simplify Sphinx markup for single-world links
Dec 11 2020, 3:06 PM
zack closed T1139: ingest major gitlab instances as Resolved.

All instances listed in this task have been added, so I'm closing it. Other instances can be added in the future; submit a matching task for tracking as needed.

Dec 11 2020, 2:35 PM · Archive coverage, Origin-GitLab
zack accepted D4696: Add "swh web search" command to perform archive searches via the CLI.

LGTM! I just added a few minor comments that can be taken care of before landing.

Dec 11 2020, 12:36 PM
zack accepted D4718: Rewrite of the export pipeline using Exporters.
Dec 11 2020, 12:19 PM
zack accepted D4707: graph export: handle labels.
Dec 11 2020, 12:12 PM
zack accepted D4633: Add a documentation/specification of the journal messages formats.

LGTM in general.

Dec 11 2020, 11:37 AM

Dec 10 2020

zack added a comment to T2793: add notable past events to the archive changelog.

I've completed a bunch of these in 86f8b213e23970feb9f9bda8ab87fc7d6851abf0

Dec 10 2020, 4:55 PM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 4:54 PM · Archive coverage, Documentation
zack committed rDDOC86f8b213e239: archive changelog: add IPOL, Bitbucket, CRAN, Save Code Now, PyPI, HAL (authored by zack).
archive changelog: add IPOL, Bitbucket, CRAN, Save Code Now, PyPI, HAL
Dec 10 2020, 4:54 PM
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 4:21 PM · Archive coverage, Documentation
zack requested changes to D4696: Add "swh web search" command to perform archive searches via the CLI.
Dec 10 2020, 2:17 PM
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:30 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:28 AM · Archive coverage, Documentation
zack closed T192: analyze 4 loading failures for GNU tarballs and reimport them as Wontfix.

Closing as long-stale. Further investigation should be done when we restart periodic GNU release ingestion.

Dec 10 2020, 11:19 AM · Tarball loader
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:17 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:14 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:12 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:10 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:07 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:04 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:03 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:01 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 10:58 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 10:54 AM · Archive coverage, Documentation
zack closed T682: Ingest Google Code Mercurial repositories, a subtask of T367: ingest Google Code repositories, as Resolved.
Dec 10 2020, 10:52 AM · Archive coverage, Restricted Project
zack closed T682: Ingest Google Code Mercurial repositories as Resolved.
Dec 10 2020, 10:52 AM · Archive coverage, Mercurial loader

Dec 9 2020

zack added inline comments to D4698: Add support for ExtID in the storage.
Dec 9 2020, 6:40 PM

Dec 8 2020

zack triaged T2869: web search: allow to filter by origin type as Normal priority.
Dec 8 2020, 6:29 PM · Web app
zack accepted D4689: FUSE: fs: lookup: add optional regexp name validation.
Dec 8 2020, 5:19 PM
zack requested changes to D4689: FUSE: fs: lookup: add optional regexp name validation.
Dec 8 2020, 5:12 PM
zack created P895 Command-Line Input.
Dec 8 2020, 2:48 PM
zack added a project to T2822: add "swh web search" command to perform archive searches via the CLI: Software Heritage filesystem.
Dec 8 2020, 11:48 AM · Software Heritage filesystem, Web client
zack renamed T2843: FUSE: multiple origin visits on the same day are ignored from Multiple visits on the same day are ignored to FUSE: multiple origin visits on the same day are ignored.
Dec 8 2020, 11:34 AM · Software Heritage filesystem
zack accepted D4676: FUSE: fs: rework archive/ visibility and content.
Dec 8 2020, 9:37 AM

Dec 7 2020

zack requested changes to D4676: FUSE: fs: rework archive/ visibility and content.
Dec 7 2020, 10:32 PM
zack added a comment to T2849: Design and implement a mapping from "original VCS ids" to SWHIDs to help incremental loaders.

Thanks, this would be an important new feature. Some comments/random thoughts below.

Dec 7 2020, 9:08 PM · Storage manager
zack updated the task description for T2854: docker: Service Startup Error.
Dec 7 2020, 11:18 AM · Docker environment, Development environment
zack triaged T2854: docker: Service Startup Error as High priority.
Dec 7 2020, 11:16 AM · Docker environment, Development environment

Dec 4 2020

zack added a comment to T2619: Make the front page "archive size" graphs consistent with one another.

Go for the Internet Archive version. I think the points you have are enough; even if there are bumps, it's much better than what we currently have!

Dec 4 2020, 4:42 PM · Web app
zack added a comment to T2851: FUSE: directories referencing artifacts missing from the archive are reported as empty.

Good catch! A broken symlink would be preferable to omitting the entry.

Dec 4 2020, 1:50 PM · Software Heritage filesystem
zack renamed T2851: FUSE: directories referencing artifacts missing from the archive are reported as empty from FUSE: directory completely empty when one artifact is missing from the archive to FUSE: directories referencing artifacts missing from the archive are reported as empty.
Dec 4 2020, 1:49 PM · Software Heritage filesystem
zack committed rMSLDa03ac1c909b9: Add 2020-12-02-swhfs-inria slides (authored by haltode).
Add 2020-12-02-swhfs-inria slides
Dec 4 2020, 10:10 AM
zack closed D4660: Add 2020-12-02-swhfs-inria slides.
Dec 4 2020, 10:10 AM
zack accepted D4660: Add 2020-12-02-swhfs-inria slides.
Dec 4 2020, 10:02 AM

Dec 3 2020

zack committed rDFUSE141277e4f536: tutorial: update snapshot example to the current nested branch layout (authored by zack).
tutorial: update snapshot example to the current nested branch layout
Dec 3 2020, 6:49 PM
zack accepted D4652: fuse: add dedicated logger instead of root logger.
Dec 3 2020, 6:28 PM
zack requested changes to D4652: fuse: add dedicated logger instead of root logger.
Dec 3 2020, 4:41 PM
zack accepted D4657: fuse: origin: properly handle invalid origin URL.
Dec 3 2020, 4:27 PM
zack added a comment to T2771: FUSE: rethink the visibility of files under archive/ and meta/, and possibly add a new cache/ entrypoint.
In T2771#53972, @seirl wrote:

We also need to discuss what exactly we put in cache/. I thought about symlinks to archive/ and meta/, what do you think? Removing the symlinks also means removing the data from the cache.

Dec 3 2020, 2:01 PM · Software Heritage filesystem
zack renamed T2771: FUSE: rethink the visibility of files under archive/ and meta/, and possibly add a new cache/ entrypoint from FUSE: shard entries returned by ls {archive,meta}/, hiding {archive,meta}/SWHID entries to FUSE: rethink the visibility of files under archive/ and meta/, and possibly add a new cache/ entrypoint.
Dec 3 2020, 1:40 PM · Software Heritage filesystem
zack added a comment to T2771: FUSE: rethink the visibility of files under archive/ and meta/, and possibly add a new cache/ entrypoint.

New proposal (lather, rinse, repeat…) based on an idea from @seirl:

Dec 3 2020, 1:37 PM · Software Heritage filesystem
zack added a project to T2834: Use msgpack extension types instead of custom swh encoders/decoders: Journal.
Dec 3 2020, 1:30 PM · Journal
zack added a comment to T2848: web app: use SWHIDs everywhere in metadata view.

Sounds good to me.

Dec 3 2020, 12:57 PM · Metadata workflow, Web app
zack added a comment to T2848: web app: use SWHIDs everywhere in metadata view.

Oh, good point about the SWHIDs being already available as Permalinks.
I don't know exactly what you mean by "metadata associated to a SWH object", but I'm certainly in favor of reducing information duplication.

Dec 3 2020, 12:40 PM · Metadata workflow, Web app
zack added a comment to D4653: client: add origin_exists() method.

As discussed on IRC, if the other *_exists() methods have unit tests, please also add one for this new method before landing.

Dec 3 2020, 12:37 PM
zack accepted D4653: client: add origin_exists() method.
Dec 3 2020, 12:35 PM

Dec 2 2020

zack triaged T2848: web app: use SWHIDs everywhere in metadata view as Normal priority.
Dec 2 2020, 7:40 PM · Metadata workflow, Web app
zack added a comment to T2842: Hang while running ls on a directory containing a symlink.

@vlorentz: can we have the logs please?
Run mount with the "--foreground" option and/or check your user log in "journalctl --user".
TIA

Dec 2 2020, 3:33 PM · Software Heritage filesystem
zack added a reviewer for D4648: jobs/swh-environment: Mitigate backtracking issue with new pip resolver: Reviewers.
Dec 2 2020, 2:58 PM
zack added a comment to T2841: FUSE: update cache with new origin visits.

The difficulty with this one is deciding when to re-query the backend to check whether there are new visits. Doing it too often will make the cache of visit metadata useless. Doing it too rarely will make you miss new visits. Either way, we probably need to add a timestamp somewhere in the cache recording when the metadata were last fetched (which is not the same as the most recent visit timestamp).

Dec 2 2020, 2:53 PM · Software Heritage filesystem
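
The trade-off described in the T2841 comment above can be illustrated with a minimal sketch. This is not the swh-fuse implementation: the class names, the `fetch` callback, the `max_age` parameter, and the visit representation are all hypothetical. It only shows the idea of storing a "last fetched" timestamp next to the cached visit list and re-querying the backend once that timestamp is too old.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CachedVisits:
    visits: List[str]   # visit identifiers as returned by the backend (hypothetical format)
    fetched_at: float   # when the backend was last queried, NOT the latest visit date


@dataclass
class VisitCache:
    fetch: Callable[[str], List[str]]     # hypothetical backend query for an origin URL
    max_age: float = 3600.0               # seconds before a cached entry is considered stale
    clock: Callable[[], float] = time.time
    _entries: Dict[str, CachedVisits] = field(default_factory=dict)

    def get(self, origin_url: str) -> List[str]:
        """Return cached visits, re-querying the backend if the entry is missing or stale."""
        now = self.clock()
        entry = self._entries.get(origin_url)
        if entry is None or now - entry.fetched_at >= self.max_age:
            entry = CachedVisits(visits=self.fetch(origin_url), fetched_at=now)
            self._entries[origin_url] = entry
        return entry.visits
```

A small `max_age` approaches "query every time" and defeats the cache; a large one delays the discovery of new visits. The sketch leaves that choice to the caller.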
zack added a comment to T2838: "swh fs mount" silently fails if fusermount3 isn't available.

nice catch, thanks @vlorentz.

Dec 2 2020, 2:33 PM · Software Heritage filesystem
zack added a project to T2836: swh scanner db import loads keeps all input SWHIDs in memory: Easy hack.
Dec 2 2020, 9:26 AM · Easy hack, Code scanner
zack triaged T2836: swh scanner db import loads keeps all input SWHIDs in memory as Normal priority.
Dec 2 2020, 9:26 AM · Easy hack, Code scanner

Dec 1 2020

zack added a reviewer for D4631: fs: snapshot: nest branch names as directories instead of URL-escaping: Reviewers.
Dec 1 2020, 5:25 PM
zack accepted D4637: FUSE: pre-commit: codespell: exclude tests generated data.
Dec 1 2020, 12:34 PM
zack added inline comments to D4631: fs: snapshot: nest branch names as directories instead of URL-escaping.
Dec 1 2020, 11:42 AM
zack accepted D4636: FUSE: code and docs various cleanup.

I'm accepting this, but please fix the various minor issues I've pointed out before landing.

Dec 1 2020, 11:40 AM
zack triaged T2832: FUSE: add tests for CLI commands as Low priority.
Dec 1 2020, 11:32 AM · Software Heritage filesystem
zack renamed T2771: FUSE: rethink the visibility of files under archive/ and meta/, and possibly add a new cache/ entrypoint from FUSE: make ls archive/ meta/ return no result to FUSE: shard entries returned by ls {archive,meta}/, hiding {archive,meta}/SWHID entries.
Dec 1 2020, 11:31 AM · Software Heritage filesystem

Nov 30 2020

zack accepted D4632: FUSE: tests: various code cleanup.
Nov 30 2020, 5:28 PM
zack accepted D4628: FUSE: tests: use fixed delay in test_list_history.
Nov 30 2020, 5:05 PM
zack renamed T2828: Archive counters are no longer updated in production from Production counters not up to date to Archive counters are no longer updated in production.
Nov 30 2020, 4:02 PM · Monitoring, Web app, System administration
zack raised the priority of T2828: Archive counters are no longer updated in production from High to Unbreak Now!.
Nov 30 2020, 4:02 PM · Monitoring, Web app, System administration
zack committed rMSLD7c58eaaf660d: add slides for swh-scanner talk at Open Compliance Summit 2020 (authored by zack).
add slides for swh-scanner talk at Open Compliance Summit 2020
Nov 30 2020, 1:31 PM

Nov 28 2020

zack triaged T2825: add origin (and search) example to the FUSE tutorial as Low priority.
Nov 28 2020, 1:27 PM · Documentation, Software Heritage filesystem
zack renamed T2824: Add tag to the deposit protocol to allow client to specify a parent deposit from Add tag in the deposit protocol to allow client to specify a parent deposit to Add tag to the deposit protocol to allow client to specify a parent deposit.
Nov 28 2020, 1:25 PM · SWORD deposit

Nov 27 2020

zack accepted D4628: FUSE: tests: use fixed delay in test_list_history.

Please first file a task about this issue and reference it from the comment in the code, for ease of tracking.

Nov 27 2020, 6:01 PM
zack updated the task description for T2811: FUSE: fix various paper cuts (user testing 2020-11-24).
Nov 27 2020, 1:37 PM · Software Heritage filesystem
zack triaged T2822: add "swh web search" command to perform archive searches via the CLI as Normal priority.
Nov 27 2020, 1:37 PM · Software Heritage filesystem, Web client
zack updated the task description for T2811: FUSE: fix various paper cuts (user testing 2020-11-24).
Nov 27 2020, 12:23 PM · Software Heritage filesystem
zack updated the task description for T2811: FUSE: fix various paper cuts (user testing 2020-11-24).
Nov 27 2020, 12:20 PM · Software Heritage filesystem
zack updated the task description for T2811: FUSE: fix various paper cuts (user testing 2020-11-24).
Nov 27 2020, 12:15 PM · Software Heritage filesystem
zack renamed T2820: FUSE: nest branch names in snapshot views instead of URL-escaping slashes from FUSE: next branch names in snapshot views instead of URL-escaping slashes to FUSE: nest branch names in snapshot views instead of URL-escaping slashes.
Nov 27 2020, 12:15 PM · Software Heritage filesystem
zack triaged T2820: FUSE: nest branch names in snapshot views instead of URL-escaping slashes as Low priority.
Nov 27 2020, 12:15 PM · Software Heritage filesystem
zack updated the task description for T2811: FUSE: fix various paper cuts (user testing 2020-11-24).
Nov 27 2020, 12:10 PM · Software Heritage filesystem
zack triaged T2819: logger: allow to specify/override the loglevel for a specific module as Normal priority.
Nov 27 2020, 12:09 PM · Core & foundations
zack accepted D4617: meta.json: add 'json-indent' option (default to 2).
Nov 27 2020, 10:21 AM
zack accepted D4617: meta.json: add 'json-indent' option (default to 2).

(but please change the default indent level as suggested before landing)

Nov 27 2020, 10:14 AM

Nov 26 2020

zack accepted D4598: docs: config: add logging section.
Nov 26 2020, 11:25 AM
zack committed rDDOC3bc9848ddff8: archive journal: fix sphinx markup errors making links 404 (authored by zack).
archive journal: fix sphinx markup errors making links 404
Nov 26 2020, 11:23 AM
zack accepted D4597: docs: cli: add explicit anchor.
Nov 26 2020, 11:16 AM
zack accepted D4596: Clean CLI config logging.
Nov 26 2020, 10:56 AM

Nov 25 2020

zack triaged T2813: swh scanner db import does not validate SWHIDs as Low priority.
Nov 25 2020, 10:37 PM · Code scanner
zack triaged T2812: scanner import db is slow, improve its performances as Low priority.
Nov 25 2020, 10:00 PM · Code scanner
zack removed hashtags from Software Heritage filesystem: #fuse_virtual_file_system, #user-space_filesystem.
Nov 25 2020, 5:11 PM
zack renamed Software Heritage filesystem from User-space filesystem to Software Heritage filesystem.
Nov 25 2020, 5:11 PM
zack renamed Software Heritage filesystem from user-space filesystem to User-space filesystem.
Nov 25 2020, 5:10 PM
zack renamed Software Heritage filesystem from FUSE virtual file system to user-space filesystem.
Nov 25 2020, 5:09 PM
zack added a revision to T2802: FUSE: avoid logging normal conditions like ENOENT: D4594: fuse: lookup: do not log ENOENT.
Nov 25 2020, 5:07 PM · Software Heritage filesystem