vlorentz (Valentin Lorentz)
User

User Details

User Since
Oct 1 2018, 11:23 AM (218 w, 4 d)

Recent Activity

Thu, Dec 8

vlorentz requested review of D8947: Document statsd metrics and link to dashboards.
Thu, Dec 8, 1:20 PM
vlorentz closed D8935: Add dataset name to the export id.
Thu, Dec 8, 11:47 AM
vlorentz committed rDGRPH94b1d2c14fe8: Add dataset name to the export id (authored by vlorentz).
Add dataset name to the export id
Thu, Dec 8, 11:47 AM
vlorentz closed T4354: Contribute terms to ForgeFed as Resolved.
Thu, Dec 8, 11:44 AM · Archive search, Metadata workflow
vlorentz closed T4354: Contribute terms to ForgeFed, a subtask of T4249: Choose/define an ontology to use for indexed extrinsic origin metadata, as Resolved.
Thu, Dec 8, 11:44 AM · Archive search, Metadata workflow
vlorentz updated the task description for T4354: Contribute terms to ForgeFed.
Thu, Dec 8, 11:44 AM · Archive search, Metadata workflow
vlorentz accepted D8944: replay: Copy dir states and external paths in copy_from operations.
Thu, Dec 8, 11:39 AM
vlorentz accepted D8882: replay: Do not ignore externals in copyfrom operations.
Thu, Dec 8, 11:38 AM
vlorentz accepted D8941: replay: Simplify FileEditor implementation.

huh, nice

Thu, Dec 8, 11:38 AM
vlorentz added a comment to D8944: replay: Copy dir states and external paths in copy_from operations.

Are you sure the path argument to add_directory cannot start with a "/" or contain ".."?

Thu, Dec 8, 11:36 AM
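
The concern raised in the comment above (a path that starts with "/" or contains "..") can be illustrated with a small check. This is only a sketch: is_safe_relative_path is a hypothetical helper, not part of the reviewed diff, and it assumes the path is a POSIX-style str rather than bytes.

```python
from pathlib import PurePosixPath


def is_safe_relative_path(path: str) -> bool:
    """Return True if `path` is relative and has no ".." component.

    Hypothetical helper for illustration only; the real add_directory
    callback may receive paths in a different form.
    """
    p = PurePosixPath(path)
    # PurePosixPath does not collapse "..", so it still shows up in .parts.
    return not p.is_absolute() and ".." not in p.parts


if __name__ == "__main__":
    for candidate in ("trunk/src", "/etc/passwd", "trunk/../../x"):
        print(candidate, is_safe_relative_path(candidate))
```

Checking path components rather than substrings keeps names such as "..hidden" valid while still rejecting a real parent-directory reference.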
vlorentz accepted D8942: utils: Raise ValueError when external definition could not be parsed.
Thu, Dec 8, 11:33 AM
vlorentz added a comment to D8939: Rework the replaying exception handling.

Could you use a logger instance, and add "if logger.isEnabledFor(logging.DEBUG):" before the logger.debug statements that use hash_to_hex?

Thu, Dec 8, 11:31 AM
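
The pattern requested in the comment above might look like the following sketch. The hash_to_hex defined here is a stand-in (an assumption, so that the example is self-contained), and report_revision is a hypothetical function name, not code from D8939.

```python
import logging

# Module-level logger instance, as suggested in the review comment.
logger = logging.getLogger(__name__)


def hash_to_hex(h: bytes) -> str:
    # Stand-in for the real helper (assumption): hex-encode a byte hash.
    return h.hex()


def report_revision(rev_id: bytes) -> None:
    # Hypothetical caller. The guard skips the hex conversion entirely
    # unless DEBUG records would actually be processed by this logger.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("replaying revision %s", hash_to_hex(rev_id))


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    report_revision(b"\x01\x02")
```

Lazy %-style formatting already defers building the message string, but the arguments themselves are still evaluated at the call site, which is why the explicit guard helps when an argument is costly to compute.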
vlorentz accepted D8946: svn_retry: Reduce max number of retry attempts to 3.
Thu, Dec 8, 2:29 AM
vlorentz accepted D8945: api, browse: Ensure to sanitize filename passed to django FileResponse.
Thu, Dec 8, 2:29 AM
vlorentz requested changes to D8943: svn: Use urllib.parse.quote to percent encode svn URLs.
Thu, Dec 8, 2:27 AM

Wed, Dec 7

vlorentz closed D8932: Replace RunAll with RunExportCompressUpload.
Wed, Dec 7, 5:15 PM
vlorentz committed rDGRPHcd69e48b5acc: Replace RunAll with RunExportCompressUpload (authored by vlorentz).
Replace RunAll with RunExportCompressUpload
Wed, Dec 7, 5:15 PM
vlorentz closed D8931: Prevent incorrect warning from being printed to output files.
Wed, Dec 7, 5:15 PM
vlorentz committed rDGRPH233b0508395a: Prevent incorrect warning from being printed to output files (authored by vlorentz).
Prevent incorrect warning from being printed to output files
Wed, Dec 7, 5:15 PM
vlorentz committed rDGRPH042af3adf5b6: Fix crash when the sensitive dataset directory does not exist (authored by vlorentz).
Fix crash when the sensitive dataset directory does not exist
Wed, Dec 7, 5:15 PM
vlorentz requested review of D8935: Add dataset name to the export id.
Wed, Dec 7, 4:57 PM
vlorentz closed D8934: Remove tool ids from Kafka messages.
Wed, Dec 7, 4:34 PM
vlorentz committed rDCIDXe8549400bc54: Remove tool ids from Kafka messages (authored by vlorentz).
Remove tool ids from Kafka messages
Wed, Dec 7, 4:34 PM
vlorentz updated the diff for D8932: Replace RunAll with RunExportCompressUpload.

rebase

Wed, Dec 7, 3:24 PM
vlorentz updated the diff for D8931: Prevent incorrect warning from being printed to output files.

I'm tired

Wed, Dec 7, 3:24 PM
vlorentz updated the diff for D8932: Replace RunAll with RunExportCompressUpload.

rebase

Wed, Dec 7, 3:18 PM
vlorentz updated the diff for D8931: Prevent incorrect warning from being printed to output files.

remove useless function

Wed, Dec 7, 3:18 PM
vlorentz updated the diff for D8932: Replace RunAll with RunExportCompressUpload.

rebase

Wed, Dec 7, 3:10 PM
vlorentz updated the diff for D8931: Prevent incorrect warning from being printed to output files.

less awful fix

Wed, Dec 7, 3:09 PM
vlorentz planned changes to D8931: Prevent incorrect warning from being printed to output files.
Wed, Dec 7, 2:44 PM
vlorentz added a comment to D8931: Prevent incorrect warning from being printed to output files.
In D8931#232231, @olasd wrote:

Why not just touch all the files?

Wed, Dec 7, 2:43 PM
vlorentz created P1541 (An Untitled Masterwork).
Wed, Dec 7, 2:31 PM
vlorentz requested review of D8934: Remove tool ids from Kafka messages.
Wed, Dec 7, 2:20 PM
vlorentz added a revision to T4719: indexer storage crashes on kafka errors, because of integers in the key: D8934: Remove tool ids from Kafka messages.
Wed, Dec 7, 2:08 PM · Indexer
vlorentz triaged T4719: indexer storage crashes on kafka errors, because of integers in the key as Normal priority.
Wed, Dec 7, 2:08 PM · Indexer
vlorentz requested review of D8932: Replace RunAll with RunExportCompressUpload.
Wed, Dec 7, 12:55 PM
vlorentz accepted D8933: task add: Ensure task type provided exist and raise otherwise.
Wed, Dec 7, 12:55 PM
vlorentz requested review of D8931: Prevent incorrect warning from being printed to output files.
Wed, Dec 7, 12:54 PM
vlorentz committed rDGRPH66253a872d6b: Add missing dependency on pytest-mock (authored by vlorentz).
Add missing dependency on pytest-mock
Wed, Dec 7, 12:47 PM
vlorentz closed D8930: origin_contributors: Fix typo and improve readability.
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPHb8ddd6ceadbd: origin_contributors: Fix typo and improve readability (authored by vlorentz).
origin_contributors: Fix typo and improve readability
Wed, Dec 7, 10:40 AM
vlorentz closed D8910: Regenerate the test dataset to include a release with no author.
Wed, Dec 7, 10:40 AM
vlorentz closed D8919: Add CLI script to generate Luigi config and call it.
Wed, Dec 7, 10:40 AM
vlorentz closed D8917: Split swh/graph/luigi.py into modules.
Wed, Dec 7, 10:40 AM
vlorentz closed D8912: ListOriginContributors: Ignore null author/committer in revisions/releases.
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPHe65858a73918: Split swh/graph/luigi.py into modules (authored by vlorentz).
Split swh/graph/luigi.py into modules
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPHb76801259953: Add CLI script to generate Luigi config and call it (authored by vlorentz).
Add CLI script to generate Luigi config and call it
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPHdfd4c1dc3b22: ListOriginContributors: Ignore null author/committer in revisions/releases (authored by vlorentz).
ListOriginContributors: Ignore null author/committer in revisions/releases
Wed, Dec 7, 10:40 AM
vlorentz closed D8908: Add ListOriginContributors.
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPHab2703efcb9a: Add Luigi task TopoSort and add a simple test (authored by vlorentz).
Add Luigi task TopoSort and add a simple test
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPHf3235e318485: Add ListOriginContributors (authored by vlorentz).
Add ListOriginContributors
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPH559d4068bfe1: Regenerate the test dataset to include a release with no author (authored by vlorentz).
Regenerate the test dataset to include a release with no author
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPH7bee5d47a6eb: revert multithreading, it's actually twice as slow as singlethread (authored by vlorentz).
revert multithreading, it's actually twice as slow as singlethread
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPH58f44785816b: Improve comments (authored by vlorentz).
Improve comments
Wed, Dec 7, 10:40 AM
vlorentz committed rDGRPH922894410b6e: Add a sample of two ancestor with each node (authored by vlorentz).
Add a sample of two ancestor with each node
Wed, Dec 7, 10:40 AM
vlorentz closed D8883: Add a script to generate a topological sort.
Wed, Dec 7, 10:39 AM
vlorentz committed rDGRPH30dad16a2365: tentative multithread DFS (authored by vlorentz).
tentative multithread DFS
Wed, Dec 7, 10:39 AM
vlorentz committed rDGRPHed6636c26be8: Implement a naive topological sort (authored by vlorentz).
Implement a naive topological sort
Wed, Dec 7, 10:39 AM
vlorentz closed D8903: luigi: Add tasks UploadGraphToS3 and DownloadGraphFromS3.
Wed, Dec 7, 10:39 AM
vlorentz committed rDGRPHb8dc411ccd30: luigi: Add tasks UploadGraphToS3 and DownloadGraphFromS3 (authored by vlorentz).
luigi: Add tasks UploadGraphToS3 and DownloadGraphFromS3
Wed, Dec 7, 10:39 AM
vlorentz added a comment to D8908: Add ListOriginContributors.

fixed by D8930

Wed, Dec 7, 10:09 AM
vlorentz updated the diff for D8919: Add CLI script to generate Luigi config and call it.

rebase + fix typos + improve readability

Wed, Dec 7, 10:08 AM
vlorentz updated the diff for D8917: Split swh/graph/luigi.py into modules.

rebase

Wed, Dec 7, 10:08 AM
vlorentz updated the diff for D8912: ListOriginContributors: Ignore null author/committer in revisions/releases.

rebase

Wed, Dec 7, 10:08 AM
vlorentz updated the diff for D8910: Regenerate the test dataset to include a release with no author.

rebase

Wed, Dec 7, 10:08 AM
vlorentz updated the diff for D8908: Add ListOriginContributors.

rebase

Wed, Dec 7, 10:08 AM
vlorentz updated the diff for D8883: Add a script to generate a topological sort.

rebase

Wed, Dec 7, 10:08 AM
vlorentz updated the diff for D8903: luigi: Add tasks UploadGraphToS3 and DownloadGraphFromS3.

rebase

Wed, Dec 7, 10:08 AM
vlorentz closed D8926: luigi.RunExportAll: Default to exporting all formats.
Wed, Dec 7, 10:03 AM
vlorentz closed D8925: luigi.CreateAthena: Fix validation of DB name.
Wed, Dec 7, 10:03 AM
vlorentz committed rDDATASETeceaf73f0fba: luigi.CreateAthena: Fix validation of DB name (authored by vlorentz).
luigi.CreateAthena: Fix validation of DB name
Wed, Dec 7, 10:03 AM
vlorentz committed rDDATASETc717f60fe08e: luigi.RunExportAll: Default to exporting all formats (authored by vlorentz).
luigi.RunExportAll: Default to exporting all formats
Wed, Dec 7, 10:03 AM
vlorentz closed D8924: exporters/orc: Fix crash on visit status with no type.
Wed, Dec 7, 10:03 AM
vlorentz committed rDDATASET22f7ed11f688: exporters/orc: Fix crash on visit status with no type (authored by vlorentz).
exporters/orc: Fix crash on visit status with no type
Wed, Dec 7, 10:02 AM
vlorentz added inline comments to D8908: Add ListOriginContributors.
Wed, Dec 7, 9:45 AM
vlorentz closed T1345: Update metadata docs about using CodeMeta vocabulary as Resolved.
Wed, Dec 7, 6:20 AM · Documentation
vlorentz closed T1345: Update metadata docs about using CodeMeta vocabulary, a subtask of T1649: Update documentation with compliance scenario changes, as Resolved.
Wed, Dec 7, 6:20 AM · SWORD deposit
vlorentz added a comment to T1345: Update metadata docs about using CodeMeta vocabulary.

yes

Wed, Dec 7, 6:20 AM · Documentation

Tue, Dec 6

vlorentz added a comment to D8907: feat: Add Hex.pm lister.

order sounds best. Do you want to do it?

Tue, Dec 6, 6:18 PM
vlorentz added a comment to T4394: Add support for running metadata fetchers without a VCS/package loaders.

We decided to add recurring fetches, so it will take care both of backfilling now, and visiting from time to time in the future. We're going to assume 3 months for now, as it seems reasonable to not exhaust rate limits.

Tue, Dec 6, 3:54 PM · Extrinsic metadata
vlorentz added a revision to T2220: swh-graph in production: D8919: Add CLI script to generate Luigi config and call it.
Tue, Dec 6, 2:37 PM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
vlorentz added a task to D8919: Add CLI script to generate Luigi config and call it: T2220: swh-graph in production.
Tue, Dec 6, 2:37 PM
vlorentz added a task to D8919: Add CLI script to generate Luigi config and call it: T4676: Add Luigi workflow in swh-dataset.
Tue, Dec 6, 2:37 PM
vlorentz added a task to D8924: exporters/orc: Fix crash on visit status with no type: T4676: Add Luigi workflow in swh-dataset.
Tue, Dec 6, 2:37 PM
vlorentz added a task to D8925: luigi.CreateAthena: Fix validation of DB name: T4676: Add Luigi workflow in swh-dataset.
Tue, Dec 6, 2:37 PM
vlorentz added a task to D8926: luigi.RunExportAll: Default to exporting all formats: T4676: Add Luigi workflow in swh-dataset.
Tue, Dec 6, 2:37 PM
vlorentz added revisions to T4676: Add Luigi workflow in swh-dataset: D8919: Add CLI script to generate Luigi config and call it, D8924: exporters/orc: Fix crash on visit status with no type, D8925: luigi.CreateAthena: Fix validation of DB name, D8926: luigi.RunExportAll: Default to exporting all formats.
Tue, Dec 6, 2:37 PM · Datasets, Compressed graph service
vlorentz requested review of D8926: luigi.RunExportAll: Default to exporting all formats.
Tue, Dec 6, 2:07 PM
vlorentz requested review of D8925: luigi.CreateAthena: Fix validation of DB name.
Tue, Dec 6, 2:05 PM
vlorentz requested review of D8924: exporters/orc: Fix crash on visit status with no type.
Tue, Dec 6, 2:04 PM
vlorentz accepted D8923: archive_coverage: Add link to Archive Changelog in coverage widget.

nice

Tue, Dec 6, 1:46 PM
vlorentz accepted D8920: from_disk.Content: Add missing path info for symlink.

ah, so it doesn't matter for other loaders. Phew!

Tue, Dec 6, 1:36 PM
vlorentz added a comment to D8920: from_disk.Content: Add missing path info for symlink.

Does it mean we were silently dropping paths until this? Which loaders use this?

Tue, Dec 6, 12:08 PM

Mon, Dec 5

vlorentz added a comment to D8918: gitlab: allow ignoring projects with certain path prefixes.

Could you add this check?

Mon, Dec 5, 4:24 PM
vlorentz requested review of D8919: Add CLI script to generate Luigi config and call it.
Mon, Dec 5, 3:53 PM
vlorentz requested review of D8917: Split swh/graph/luigi.py into modules.
Mon, Dec 5, 2:53 PM
vlorentz requested review of D8877: Fix incorrect error messages when failing to connect.
Mon, Dec 5, 1:50 PM
vlorentz triaged T4714: Write Luigi tasks to generate the citation dataset as Normal priority.
Mon, Dec 5, 10:51 AM · Datasets
vlorentz triaged T4713: Generate the citation dataset as Normal priority.
Mon, Dec 5, 10:51 AM · Datasets
vlorentz updated the task description for T4712: Write Luigi tasks to regenerate the license dataset.
Mon, Dec 5, 10:50 AM · Datasets