Feed All Stories

Oct 17 2022

douardda closed D8678: Add a 'swh provenance replay' cli command.
Oct 17 2022, 6:08 PM
douardda committed rDPROV022b6f76614e: Add a 'swh provenance replay' cli command (authored by douardda).
Add a 'swh provenance replay' cli command
Oct 17 2022, 6:08 PM
vlorentz accepted D8689: docs: Add info about CPAN extrinsic metadata format.
Oct 17 2022, 6:07 PM
anlambert requested review of D8689: docs: Add info about CPAN extrinsic metadata format.
Oct 17 2022, 5:59 PM
vlorentz accepted D8688: model: Fix hypothesis integration with attr < 21.3.0.
Oct 17 2022, 5:56 PM
vlorentz added a revision to T4637: Document/showcase examples gRPC clients of the swh-graph: D8691: docs/grpc-api.rst: Add Python examples.
Oct 17 2022, 5:54 PM · Documentation, Compressed graph service
Harbormaster failed remote builds in B32333: Diff 31385 for D8690: docs/grpc-api.rst: Update to match to current code!
Oct 17 2022, 5:54 PM
Harbormaster failed to build B32332: rDGRPHb5ea368cd4a2: Merge branch 'import-from-license-dataset' for rDGRPHb5ea368cd4a2: Merge branch 'import-from-license-dataset'!
Oct 17 2022, 5:54 PM
Harbormaster failed to build B32331: rDGRPHe5574bccb4d9: FindEarliestRevision: Add earliest_ts and rev_occurrences columns for rDGRPHe5574bccb4d9: FindEarliestRevision: Add earliest_ts and rev_occurrences columns!
Oct 17 2022, 5:54 PM
swh-public-ci added a comment to D8690: docs/grpc-api.rst: Update to match to current code.

Build has FAILED

Oct 17 2022, 5:54 PM
vlorentz published D8690: docs/grpc-api.rst: Update to match to current code for review.
Oct 17 2022, 5:53 PM
vlorentz committed rDGRPHb5ea368cd4a2: Merge branch 'import-from-license-dataset' (authored by vlorentz).
Merge branch 'import-from-license-dataset'
Oct 17 2022, 5:52 PM
vlorentz closed D8662: FindEarliestRevision: Add earliest_ts and rev_occurrences columns.
Oct 17 2022, 5:52 PM
vlorentz committed rDGRPHe5574bccb4d9: FindEarliestRevision: Add earliest_ts and rev_occurrences columns (authored by vlorentz).
FindEarliestRevision: Add earliest_ts and rev_occurrences columns
Oct 17 2022, 5:52 PM
anlambert added a revision to T2833: cpan.loader - archive Perl modules from CPAN: D8689: docs: Add info about CPAN extrinsic metadata format.
Oct 17 2022, 5:43 PM · CPAN lister, Archive coverage
olasd committed rSPSITE35918e257de5: Give swhworker, olasd and ardumont direct access to tate for the GitLab… (authored by olasd).
Give swhworker, olasd and ardumont direct access to tate for the GitLab…
Oct 17 2022, 5:37 PM
vlorentz added a comment to D8678: Add a 'swh provenance replay' cli command.

I see, it makes sense then.

Oct 17 2022, 5:35 PM
anlambert requested review of D8688: model: Fix hypothesis integration with attr < 21.3.0.
Oct 17 2022, 5:32 PM
anlambert closed D8652: cpan: Collect extrinsic metadata for each module release.
Oct 17 2022, 5:32 PM
anlambert committed rDLDBASE85963318aab6: cpan: Collect extrinsic metadata for each module release (authored by anlambert).
cpan: Collect extrinsic metadata for each module release
Oct 17 2022, 5:32 PM
anlambert closed D8651: cpan: Do not parse intrinsic metadata for getting module author.
Oct 17 2022, 5:32 PM
anlambert committed rDLDBASE7b929606a78f: cpan: Do not parse intrinsic metadata for getting module author (authored by anlambert).
cpan: Do not parse intrinsic metadata for getting module author
Oct 17 2022, 5:32 PM
olasd committed R263:2aff3cadfa49: Implement `default-branch-name` in vc-repository model (authored by olasd).
Implement `default-branch-name` in vc-repository model
Oct 17 2022, 5:26 PM
anlambert closed D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.
Oct 17 2022, 5:20 PM
anlambert closed T4633: Make MerkleNode.collect return a set of MerkleNode instead of a dict as Resolved by committing rDMOD13e7adc3e854: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.
Oct 17 2022, 5:20 PM · Data Model
anlambert committed rDMOD13e7adc3e854: merkle: Make MerkleNode.collect return a set of nodes instead of a dict (authored by anlambert).
merkle: Make MerkleNode.collect return a set of nodes instead of a dict
Oct 17 2022, 5:20 PM
douardda added a comment to D8678: Add a 'swh provenance replay' cli command.

Bikeshedding: it should be called "journal-client" rather than "replay" for consistency with swh-indexer and swh-search. (swh-storage only calls it "replay" because it's used to copy from another instance of the same code so it "replays" the same API calls; but here it may be the first "play")

I'm not sure I follow you there; this really is a replayer feature: it aims at replicating a provenance DB via a kafka journal.
We already have a journal client in provenance consuming the main archive's revision and origin-visit-status topics. The CLI commands are 'swh provenance revision from-journal' and 'swh provenance origin from-journal' (that is, they execute the {origin,revision} layer reading from the journal; there are from-csv versions of these commands as well).

Oct 17 2022, 5:20 PM
douardda added a comment to D8678: Add a 'swh provenance replay' cli command.

Bikeshedding: it should be called "journal-client" rather than "replay" for consistency with swh-indexer and swh-search. (swh-storage only calls it "replay" because it's used to copy from another instance of the same code so it "replays" the same API calls; but here it may be the first "play")

Oct 17 2022, 5:17 PM
ardumont moved T4614: Deploy swh-search v0.16.4 from in-progress to deployed/landed/monitoring on the System administration board.
Oct 17 2022, 4:57 PM · System administration, Archive search
ardumont changed the status of T4614: Deploy swh-search v0.16.4 from Open to Work in Progress.
Oct 17 2022, 4:57 PM · System administration, Archive search
ardumont changed the status of T4614: Deploy swh-search v0.16.4, a subtask of T4599: Github descriptions are not used to search origins, from Open to Work in Progress.
Oct 17 2022, 4:57 PM · Metadata workflow, Archive search
ardumont updated the task description for T4614: Deploy swh-search v0.16.4.
Oct 17 2022, 4:57 PM · System administration, Archive search
olasd added a project to T4617: Test task please ignore: Test tag please ignore.

This project has migrated to GitLab

Oct 17 2022, 4:27 PM
olasd removed a project from T4617: Test task please ignore: Test tag please ignore.
Oct 17 2022, 4:27 PM
olasd added a project to T4617: Test task please ignore: Test tag please ignore.
Oct 17 2022, 4:17 PM
olasd created Test tag please ignore.
Oct 17 2022, 4:17 PM
ardumont closed D8687: Add migration node.
Oct 17 2022, 4:07 PM
ardumont committed rSPREeced918d82ba: Add migration node (authored by ardumont).
Add migration node
Oct 17 2022, 4:07 PM
ardumont updated the diff for D8687: Add migration node.

Rebase

Oct 17 2022, 4:06 PM
vsellier accepted D8687: Add migration node.

LGTM

Oct 17 2022, 3:53 PM
ardumont updated the diff for D8687: Add migration node.

Amend

Oct 17 2022, 3:50 PM
ardumont updated the diff for D8687: Add migration node.

Amend

Oct 17 2022, 3:47 PM
ardumont requested review of D8687: Add migration node.
Oct 17 2022, 3:46 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCRUB2deb6bcb2db4: Updated backport on buster-swh from debian/0.1.1-1_swh1 (unstable-swh) (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Updated backport on buster-swh from debian/0.1.1-1_swh1 (unstable-swh)
Oct 17 2022, 3:12 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCRUB09873dc192a9: Merge tag 'debian/0.1.1-1_swh1' into debian/buster-swh (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Merge tag 'debian/0.1.1-1_swh1' into debian/buster-swh
Oct 17 2022, 3:12 PM
vlorentz updated the task description for T4639: Deploy swh-scrubber v0.1.1.
Oct 17 2022, 3:11 PM · System administration, Datastore Scrubber
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCRUBa624f9daf326: pristine-tar data for swh-scrubber_0.1.1.orig.tar.gz (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
pristine-tar data for swh-scrubber_0.1.1.orig.tar.gz
Oct 17 2022, 3:10 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCRUB3e7151f10a28: Update upstream source from tag 'debian/upstream/0.1.1' (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Update upstream source from tag 'debian/upstream/0.1.1'
Oct 17 2022, 3:10 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCRUB50787f545c31: Updated debian changelog for version 0.1.1 (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Updated debian changelog for version 0.1.1
Oct 17 2022, 3:10 PM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCRUB899dcdc18c59: New upstream version 0.1.1 (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
New upstream version 0.1.1
Oct 17 2022, 3:10 PM
vlorentz triaged T4639: Deploy swh-scrubber v0.1.1 as High priority.
Oct 17 2022, 3:10 PM · System administration, Datastore Scrubber
vlorentz closed D8609: storage_checker: Notify database when ranges are fully checked.
Oct 17 2022, 3:07 PM
vlorentz committed rDSCRUB84fa17c00be8: storage_checker: Notify database when ranges are fully checked (authored by vlorentz).
storage_checker: Notify database when ranges are fully checked
Oct 17 2022, 3:07 PM
vlorentz closed D8641: storage_checker: Do not re-check ranges already marked as checked.
Oct 17 2022, 3:07 PM
vlorentz committed rDSCRUB282242873716: storage_checker: Do not re-check ranges already marked as checked (authored by vlorentz).
storage_checker: Do not re-check ranges already marked as checked
Oct 17 2022, 3:07 PM
vlorentz accepted D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.
Oct 17 2022, 2:58 PM
vlorentz accepted D8652: cpan: Collect extrinsic metadata for each module release.
Oct 17 2022, 2:56 PM
anlambert added a comment to D8682: Improve CVS loader performances.

That's a surprisingly small diff for such a change, nice!

What speedup do you get with this?

Oct 17 2022, 2:49 PM
olasd committed R263:f38151294482: Repository archival is stored two different ways on phabricator (authored by olasd).
Repository archival is stored two different ways on phabricator
Oct 17 2022, 2:45 PM
lunar added a comment to D8671: Add a job running swh-mirror tests.

Screenshot of resource usage while the tests run on my laptop (image not preserved).

Oct 17 2022, 2:40 PM
lunar updated the summary of D8671: Add a job running swh-mirror tests.
Oct 17 2022, 2:22 PM
ardumont updated the task description for T4614: Deploy swh-search v0.16.4.
Oct 17 2022, 2:12 PM · System administration, Archive search
swh-public-ci added a comment to D8652: cpan: Collect extrinsic metadata for each module release.

Build is green

Oct 17 2022, 2:10 PM
anlambert updated the diff for D8652: cpan: Collect extrinsic metadata for each module release.

Update: s/cpan-module-json/cpan-release-json/

Oct 17 2022, 2:06 PM
swh-public-ci added a comment to D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.

Build is green

Oct 17 2022, 2:06 PM
anlambert updated the diff for D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.

Update:

  • use hash builtin instead of adding a new hash_to_int method
  • update tests
Oct 17 2022, 2:03 PM
ardumont updated the task description for T4614: Deploy swh-search v0.16.4.
Oct 17 2022, 2:01 PM · System administration, Archive search
anlambert added inline comments to D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.
Oct 17 2022, 1:39 PM
vlorentz added a comment to D8682: Improve CVS loader performances.

That's a surprisingly small diff for such a change, nice!

Oct 17 2022, 1:34 PM
vlorentz added inline comments to D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.
Oct 17 2022, 1:29 PM
anlambert added a comment to D8652: cpan: Collect extrinsic metadata for each module release.
Oct 17 2022, 1:26 PM
vlorentz accepted D8678: Add a 'swh provenance replay' cli command.

Bikeshedding: it should be called "journal-client" rather than "replay" for consistency with swh-indexer and swh-search. (swh-storage only calls it "replay" because it's used to copy from another instance of the same code so it "replays" the same API calls; but here it may be the first "play")

Oct 17 2022, 1:13 PM
vlorentz added a comment to D8652: cpan: Collect extrinsic metadata for each module release.

Please update https://docs.softwareheritage.org/devel/swh-storage/extrinsic-metadata-specification.html#extrinsic-metadata-formats when landing this

Oct 17 2022, 1:09 PM
vlorentz added inline comments to D8652: cpan: Collect extrinsic metadata for each module release.
Oct 17 2022, 1:08 PM
vlorentz accepted D8651: cpan: Do not parse intrinsic metadata for getting module author.
Oct 17 2022, 1:06 PM
vlorentz edited P1500 (An Untitled Masterwork).
Oct 17 2022, 12:41 PM
vlorentz created P1500 (An Untitled Masterwork).
Oct 17 2022, 12:40 PM
Harbormaster failed remote builds in B32317: Diff 31369 for D8077: Add a static query cost calculator to reject malicious queries!
Oct 17 2022, 12:32 PM
swh-public-ci added a comment to D8077: Add a static query cost calculator to reject malicious queries.

Build has FAILED

Oct 17 2022, 12:32 PM
jayeshv updated the diff for D8077: Add a static query cost calculator to reject malicious queries.

rebase

Oct 17 2022, 12:29 PM
olasd committed R263:25762bcbc547: Enable ssh's control socket persistence (authored by olasd).
Enable ssh's control socket persistence
Oct 17 2022, 12:20 PM
olasd committed R263:d953a637115d: Assume repositories with a custom view policy are private (authored by olasd).
Assume repositories with a custom view policy are private
Oct 17 2022, 12:20 PM
olasd committed R263:1b3d150274e3: Don't force push on the master branch, which is protected (authored by olasd).
Don't force push on the master branch, which is protected
Oct 17 2022, 12:20 PM
olasd committed R263:1379ec21d7f7: Actually perform repository archival at the end of the migration (authored by olasd).
Actually perform repository archival at the end of the migration
Oct 17 2022, 12:20 PM
anlambert added inline comments to D8686: merkle: Make MerkleNode.collect return a set of nodes instead of a dict.
Oct 17 2022, 12:04 PM
vlorentz updated the task description for T4637: Document/showcase examples gRPC clients of the swh-graph.
Oct 17 2022, 11:48 AM · Documentation, Compressed graph service
vlorentz raised the priority of T4637: Document/showcase examples gRPC clients of the swh-graph from Normal to High.
Oct 17 2022, 11:48 AM · Documentation, Compressed graph service
vlorentz claimed T4637: Document/showcase examples gRPC clients of the swh-graph.
Oct 17 2022, 11:48 AM · Documentation, Compressed graph service
vlorentz triaged T4637: Document/showcase examples gRPC clients of the swh-graph as Normal priority.
Oct 17 2022, 11:48 AM · Documentation, Compressed graph service
anlambert closed Restricted Maniphest Task, a subtask of T4625: staging: ingest netbsd.org cvs forge, as Resolved.
Oct 17 2022, 10:55 AM · System administration, Archive coverage
anlambert closed D8684: rlog: Skip rlog entry with missing header in RlogConv.parse_rlog.
Oct 17 2022, 10:55 AM
anlambert committed rDLDCVS734207ba5847: rlog: Skip rlog entry with missing header in RlogConv.parse_rlog (authored by anlambert).
rlog: Skip rlog entry with missing header in RlogConv.parse_rlog
Oct 17 2022, 10:55 AM
swh-public-ci added a comment to D8684: rlog: Skip rlog entry with missing header in RlogConv.parse_rlog.

Build is green

Oct 17 2022, 10:54 AM
anlambert updated the diff for D8684: rlog: Skip rlog entry with missing header in RlogConv.parse_rlog.

Rebase

Oct 17 2022, 10:51 AM
anlambert closed D8683: loader, cvsclient: Read files line by line to reduce memory consumption.
Oct 17 2022, 10:50 AM
anlambert committed rDLDCVScfe7507a7366: loader, cvsclient: Read files line by line to reduce memory consumption (authored by anlambert).
loader, cvsclient: Read files line by line to reduce memory consumption
Oct 17 2022, 10:50 AM
vlorentz claimed T4636: Show indexed intrinsic metadata on web UI.
Oct 17 2022, 10:38 AM · Web app, Metadata workflow
vlorentz placed T4634: Show raw extrinsic metadata on web UI up for grabs.
Oct 17 2022, 10:38 AM · Web app, Metadata workflow
vlorentz placed T4636: Show indexed intrinsic metadata on web UI up for grabs.
Oct 17 2022, 10:37 AM · Web app, Metadata workflow
vlorentz triaged T4636: Show indexed intrinsic metadata on web UI as Low priority.
Oct 17 2022, 10:37 AM · Web app, Metadata workflow