
Oct 14 2021

vlorentz added a parent task for T1957: Handling missing DAG nodes: T3134: SWHID v2.
Oct 14 2021, 12:12 PM · Data Model
olasd added a comment to T1957: Handling missing DAG nodes.

In SWHIDv2, instead of having a hardcoded "pointer to another revision" directory entry type, we could enable pointers to more generic "unresolved external entities". When possible, we should make these pointers compatible with the current ExtID table, so that users of the data can look up the contents of the pointed-to objects lazily.

Oct 14 2021, 12:06 PM · Data Model

Oct 11 2021

vlorentz updated the task description for T3595: Support disordered directory entries in git.
Oct 11 2021, 2:49 PM · meta-task, Data Model, Storage manager

Oct 8 2021

vlorentz added a subtask for T3638: Make package loaders create releases objects instead of revisions: T3636: Make the opam loader write extrinsic metadata.
Oct 8 2021, 2:32 PM · Package Loader, Data Model, Archive content
vlorentz added projects to T3638: Make package loaders create releases objects instead of revisions: Data Model, Package Loader.
Oct 8 2021, 2:30 PM · Package Loader, Data Model, Archive content

Oct 4 2021

douardda added a comment to T3611: Define the mapping for Bazaar repositories/branches to the SWH data model.

Ideally this doc would (briefly) describe how bazaar works and how it is different from already supported DVCS, then document the chosen "mapping" of the bzr model into swh (especially mentioning what is lost during this).

Oct 4 2021, 11:43 AM · Data Model, BZR loader
douardda added a comment to T3611: Define the mapping for Bazaar repositories/branches to the SWH data model.

Would it be possible to add a "conception documentation" included in the docs/ of the BZR loader repo? (possibly with D6344 or as a standalone diff)?

Oct 4 2021, 10:48 AM · Data Model, BZR loader

Sep 28 2021

Alphare added a comment to T3611: Define the mapping for Bazaar repositories/branches to the SWH data model.

The conclusions in the meeting were as follows:

Sep 28 2021, 5:02 PM · Data Model, BZR loader

Sep 27 2021

vlorentz added a project to T3611: Define the mapping for Bazaar repositories/branches to the SWH data model: Data Model.
Sep 27 2021, 1:23 PM · Data Model, BZR loader

Sep 24 2021

vlorentz added a parent task for T3594: Faithfully store weird git objects: T3552: Fix corrupted releases, revisions, and directories in the storage.
Sep 24 2021, 3:13 PM · meta-task, Data Model, Storage manager

Sep 23 2021

vlorentz updated the task description for T3609: SWHIDv2: List issues with SWHIDv1 that should be fixed.
Sep 23 2021, 5:01 PM · Roadmap 2020, Data Model, Web app, Roadmap 2021
vlorentz triaged T3609: SWHIDv2: List issues with SWHIDv1 that should be fixed as Normal priority.
Sep 23 2021, 5:00 PM · Roadmap 2020, Data Model, Web app, Roadmap 2021
vlorentz added a parent task for T3607: Document consistency guarantees of the loaders with respect to the storage: T3604: Document the architecture of all major packages/components.
Sep 23 2021, 3:00 PM · Data Model, Storage manager, Package Loader, Core Loader, Documentation
vlorentz triaged T3607: Document consistency guarantees of the loaders with respect to the storage as Normal priority.
Sep 23 2021, 3:00 PM · Data Model, Storage manager, Package Loader, Core Loader, Documentation

Sep 22 2021

vlorentz triaged T3598: Support revisions with "extra headers" not at the end as Low priority.
Sep 22 2021, 4:00 PM · Data Model, Storage manager
vlorentz added a comment to T3596: Support "weird" permissions in directories.

Complete proposal for the above solution:

Sep 22 2021, 2:56 PM · meta-task, Data Model, Storage manager
vlorentz added a comment to T3595: Support disordered directory entries in git.

Complete proposal to implement the above solution:

Sep 22 2021, 2:51 PM · meta-task, Data Model, Storage manager
vlorentz updated the task description for T3586: Figure out what to do with 'misordered' directories in Cassandra.
Sep 22 2021, 1:44 PM · Data Model, Storage manager
vlorentz updated the task description for T3594: Faithfully store weird git objects.
Sep 22 2021, 1:42 PM · meta-task, Data Model, Storage manager
vlorentz added a comment to T3596: Support "weird" permissions in directories.

Possible solution: store them as an ascii string instead of an integer.
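
As an illustration of what an integer cannot preserve, here is a minimal sketch. It assumes the mode is read as the raw ASCII octal bytes of a git tree entry and uses a zero-padded directory mode as the "weird" case; the function names are hypothetical and not part of swh-model.

```python
def roundtrip_as_int(raw_mode: bytes) -> bytes:
    """Parse the ASCII octal mode into an int, then serialize it back."""
    value = int(raw_mode.decode("ascii"), 8)
    return format(value, "o").encode("ascii")


def roundtrip_as_ascii(raw_mode: bytes) -> bytes:
    """Keep the mode as an ASCII string, so the bytes survive unchanged."""
    return raw_mode.decode("ascii").encode("ascii")


# A zero-padded directory mode loses its leading zero when stored as an int,
# which would change the bytes of the reconstructed git object.
print(roundtrip_as_int(b"040000"))
print(roundtrip_as_ascii(b"040000"))
```

Modes without padding, such as 100644, survive both round trips; only the string form keeps unusual formatting.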

Sep 22 2021, 1:38 PM · meta-task, Data Model, Storage manager
vlorentz added a comment to T3595: Support disordered directory entries in git.

Possible solution: store a rank along with each directory entry, but ignore it unless we are reconstructing a git object or computing a SWHID (v1?)
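
A minimal sketch of the rank idea, assuming a hypothetical `Entry` type with a `rank` field holding the entry's position in the original git tree. The canonical order below is a plain byte-wise sort on names, which simplifies git's actual ordering rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    name: bytes
    rank: int  # hypothetical: position of the entry in the original git tree


def canonical_order(entries: List[Entry]) -> List[Entry]:
    """Order used for everyday access: the rank is ignored."""
    return sorted(entries, key=lambda e: e.name)


def original_order(entries: List[Entry]) -> List[Entry]:
    """Order used only when rebuilding the git object or hashing it."""
    return sorted(entries, key=lambda e: e.rank)


entries = [Entry(b"b.txt", 0), Entry(b"a.txt", 1)]
print([e.name for e in canonical_order(entries)])
print([e.name for e in original_order(entries)])
```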

Sep 22 2021, 1:37 PM · meta-task, Data Model, Storage manager
vlorentz triaged T3596: Support "weird" permissions in directories as Normal priority.
Sep 22 2021, 1:36 PM · meta-task, Data Model, Storage manager
vlorentz updated the task description for T3595: Support disordered directory entries in git.
Sep 22 2021, 1:34 PM · meta-task, Data Model, Storage manager
vlorentz triaged T3595: Support disordered directory entries in git as Normal priority.
Sep 22 2021, 1:34 PM · meta-task, Data Model, Storage manager
vlorentz triaged T3594: Faithfully store weird git objects as Normal priority.
Sep 22 2021, 1:31 PM · meta-task, Data Model, Storage manager

Sep 17 2021

vlorentz updated the task description for T3586: Figure out what to do with 'misordered' directories in Cassandra.
Sep 17 2021, 11:38 AM · Data Model, Storage manager
vlorentz removed a project from T3586: Figure out what to do with 'misordered' directories in Cassandra: meta-task.
Sep 17 2021, 11:37 AM · Data Model, Storage manager
vlorentz placed T3586: Figure out what to do with 'misordered' directories in Cassandra up for grabs.
Sep 17 2021, 11:37 AM · Data Model, Storage manager
vlorentz triaged T3586: Figure out what to do with 'misordered' directories in Cassandra as Normal priority.
Sep 17 2021, 11:37 AM · Data Model, Storage manager

Sep 3 2021

vlorentz closed T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage, a subtask of T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, as Resolved.
Sep 3 2021, 11:38 AM · Data Model, Storage manager, Extrinsic metadata

Aug 11 2021

vlorentz added a comment to T3214: Restrict accepted timestamps to values that can be processed all along.

I just gave this a shot, and I can't find a way to encode large timestamps in postgresql without losing microsecond precision.

Aug 11 2021, 12:39 PM · Data Model

Jul 30 2021

vlorentz added a project to T3282: Add support for "uninterpreted upstream object" in SWH model and storage: Data Model.
Jul 30 2021, 10:16 AM · Data Model

Jun 29 2021

vlorentz removed a project from T3316: SWHID v2: determine binary-to-text encoding for checksum part: Roadmap 2021.
Jun 29 2021, 10:51 AM · Data Model

Jun 21 2021

DanSeraf closed T3393: add swhid() method to from_disk classes as Resolved by committing rDMODe4566a6605ff: from_disk: get swhid from Content/Directory objects.
Jun 21 2021, 5:16 PM · Data Model

Jun 18 2021

DanSeraf added a revision to T3393: add swhid() method to from_disk classes: D5899: swh-model: get SWHID from Content/Directory objects in from_disk.
Jun 18 2021, 4:48 PM · Data Model
zack triaged T3393: add swhid() method to from_disk classes as Normal priority.
Jun 18 2021, 11:54 AM · Data Model

Jun 15 2021

DanSeraf closed T3383: swh identify --recursive breaks --exclude, resulting in a "AttributeError: 'str' object has no attribute 'decode'" traceback as Resolved by committing rDMODe09446a6f44b: encode exclude patterns before extracting regex objects.
Jun 15 2021, 6:28 PM · Data Model
anlambert added a revision to T2187: Origin URL duplicates due to caps and .git URL: D5877: assets/save: Ensure to use canonical github repo URL as origin URL.
Jun 15 2021, 5:58 PM · Data Model
DanSeraf added a revision to T3383: swh identify --recursive breaks --exclude, resulting in a "AttributeError: 'str' object has no attribute 'decode'" traceback: D5876: swh-model: encode exclude patterns before extracting regex objects.
Jun 15 2021, 5:42 PM · Data Model
zack triaged T3383: swh identify --recursive breaks --exclude, resulting in a "AttributeError: 'str' object has no attribute 'decode'" traceback as High priority.
Jun 15 2021, 4:48 PM · Data Model
ardumont added a comment to T2187: Origin URL duplicates due to caps and .git URL.

If we also implement it as a fallback on the backend side, we should find a way to determine if an input save code now request has been created from the Web UI or through a direct call to the Web API.

Jun 15 2021, 10:25 AM · Data Model

Jun 14 2021

anlambert added a comment to T2187: Origin URL duplicates due to caps and .git URL.

This should happen both client and server side (as fallback).

Jun 14 2021, 1:45 PM · Data Model

Jun 11 2021

DanSeraf closed T3160: swh identify: add a -R/--recursive flag as Resolved.

Closed by https://forge.softwareheritage.org/D5825

Jun 11 2021, 4:42 PM · Easy hack, Data Model

Jun 8 2021

vlorentz added a revision to T3160: swh identify: add a -R/--recursive flag: D5420: cli/identify: Add support for --recursive.
Jun 8 2021, 3:28 PM · Easy hack, Data Model
vlorentz added a revision to T3160: swh identify: add a -R/--recursive flag: D5825: swh-model: add recursive option.
Jun 8 2021, 3:27 PM · Easy hack, Data Model

Jun 2 2021

ardumont added a comment to T2187: Origin URL duplicates due to caps and .git URL.

A recent discussion occurred on the #swh-devel irc channel about this issue. The gist of
it is that regarding github repositories (in the save code now [1]), the webapp should
be evolved to query the github api to determine the canonical url used for a repository
and use it as origin.
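
A rough sketch of the first step of such a lookup: building the GitHub REST API URL for a submitted origin URL. The helper name `github_api_url` is hypothetical, and this is not the webapp's actual code. It assumes the `GET /repos/{owner}/{repo}` endpoint, whose JSON response carries an `html_url` field that could then be used as the canonical origin.

```python
from urllib.parse import urlparse


def github_api_url(origin_url: str) -> str:
    """Return the GitHub API URL describing the repository at origin_url."""
    parsed = urlparse(origin_url)
    if parsed.netloc.lower() != "github.com":
        raise ValueError("not a github.com URL: %s" % origin_url)
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("missing owner or repository name: %s" % origin_url)
    owner, repo = parts[0], parts[1]
    # Drop the optional ".git" suffix, one source of duplicate origins.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return "https://api.github.com/repos/%s/%s" % (owner, repo)


print(github_api_url("https://GitHub.com/SoftwareHeritage/swh-model.git"))
```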

Jun 2 2021, 3:00 PM · Data Model

May 8 2021

zack updated the task description for T3316: SWHID v2: determine binary-to-text encoding for checksum part.
May 8 2021, 1:18 PM · Data Model
zack triaged T3316: SWHID v2: determine binary-to-text encoding for checksum part as Normal priority.
May 8 2021, 11:43 AM · Data Model
zack closed T2210: Data Model as Invalid.

Closing this as it was a vague meta-task from the 2020 roadmap (but we'll keep the actual sub-tasks, which were more clearly identified and are still relevant).

May 8 2021, 11:37 AM · Data Model, Roadmap 2020

Apr 30 2021

anlambert added a revision to T3298: Consider making SWHID handling case insensitive: D5655: assets/webapp-utils: Add lowercase validator for core SWHIDs.
Apr 30 2021, 2:43 PM · Data Model, Web app
vlorentz added a revision to T3298: Consider making SWHID handling case insensitive: D5654: docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs (this time for uppercase).
Apr 30 2021, 12:57 PM · Data Model, Web app

Apr 29 2021

anlambert added a revision to T3298: Consider making SWHID handling case insensitive: D5649: identifiers: Add support for resolving core SWHID with uppercase chars.
Apr 29 2021, 5:41 PM · Data Model, Web app
rdicosmo added a comment to T3298: Consider making SWHID handling case insensitive.

So for SWHID v1, the resolver should turn the core part into lowercase, am I right?
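
A minimal sketch of that normalization, assuming the core part is everything before the first `;` and that qualifiers (which may hold case-sensitive values such as URLs) must be left untouched. The function name is hypothetical and this is not the resolver's actual code.

```python
def lowercase_core_swhid(swhid: str) -> str:
    """Lowercase only the core part of a SWHID, keeping qualifiers as-is."""
    core, sep, qualifiers = swhid.partition(";")
    return core.lower() + sep + qualifiers


print(lowercase_core_swhid(
    "swh:1:cnt:94A9ED024D3859793618152EA559A168BBCBB5E2"
    ";origin=https://Example.org/Repo"
))
```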

Apr 29 2021, 1:16 PM · Data Model, Web app
anlambert added a comment to T3298: Consider making SWHID handling case insensitive.

I'm not a fan of changing the spec of SWHID version 1 to make them case insensitive, as it seems to be a significant change (in particular for the code that checks for the syntactic correctness of IDs).

Apr 29 2021, 12:50 PM · Data Model, Web app
vlorentz added a project to T3298: Consider making SWHID handling case insensitive: Data Model.
Apr 29 2021, 12:28 PM · Data Model, Web app

Apr 23 2021

vlorentz assigned T3134: SWHID v2 to zack.
Apr 23 2021, 4:50 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
vlorentz updated the task description for T3284: Support for multiple revision authors?.
Apr 23 2021, 2:09 PM · Data Model
ardumont updated the task description for T3284: Support for multiple revision authors?.
Apr 23 2021, 1:34 PM · Data Model
vlorentz lowered the priority of T3284: Support for multiple revision authors? from Normal to Wishlist.
Apr 23 2021, 1:22 PM · Data Model
vlorentz triaged T3284: Support for multiple revision authors? as Normal priority.
Apr 23 2021, 1:22 PM · Data Model

Apr 22 2021

douardda added a subtask for T1957: Handling missing DAG nodes: T3282: Add support for "uninterpreted upstream object" in SWH model and storage.
Apr 22 2021, 2:44 PM · Data Model
douardda added a comment to T1957: Handling missing DAG nodes.

Examples of such missing objects are revisions with attributes that cannot fit the current data model, e.g. out of range dates. We have examples of such revisions in kafka, as mentioned in T3200 and T3170.

Apr 22 2021, 2:39 PM · Data Model

Apr 21 2021

douardda added a comment to T3170: Revisions in the journal with out of range dates.

Note that none of their parent revisions can be found in the archive either (one invalid revision in a set of ingested revisions prevents any of them from being inserted in the database I suppose, but they are already inserted in kafka at this moment).

Apr 21 2021, 7:08 PM · Data Model, Journal

Apr 15 2021

vlorentz closed T3226: swh identify with type=snapshot shows dependency not installed error as Resolved.
Apr 15 2021, 3:11 PM · Data Model, SWH command line interface

Apr 12 2021

vlorentz added a comment to T3235: Add archival of bug tracker databases as well as an unofficial bug tracker per-project.

You are likely doing a git pull on a periodic basis. Just add git bug bridge pull [<name>] next to it.

Apr 12 2021, 3:37 PM · Archive coverage, Data Model
libEqualizer added a comment to T3235: Add archival of bug tracker databases as well as an unofficial bug tracker per-project.

However, this would require considerable work

Apr 12 2021, 2:48 PM · Archive coverage, Data Model
vlorentz triaged T3235: Add archival of bug tracker databases as well as an unofficial bug tracker per-project as Wishlist priority.

Hi, thanks for the suggestion.

Apr 12 2021, 11:31 AM · Archive coverage, Data Model

Apr 8 2021

pawarhrishi21 added a comment to T3226: swh identify with type=snapshot shows dependency not installed error.

You should install swh.model[cli] instead of swh.model. I added a better error message in D5466 so it's clearer.

Apr 8 2021, 7:31 PM · Data Model, SWH command line interface
vlorentz added a comment to T3226: swh identify with type=snapshot shows dependency not installed error.

And I'm also updating the documentation at https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#computing

Apr 8 2021, 7:28 PM · Data Model, SWH command line interface
vlorentz added a revision to T3226: swh identify with type=snapshot shows dependency not installed error: D5469: docs: Ask readers to install swh.model[cli] to fully use swh-identify.
Apr 8 2021, 7:27 PM · Data Model, SWH command line interface
vlorentz added a comment to T3226: swh identify with type=snapshot shows dependency not installed error.

You should install swh.model[cli] instead of swh.model. I added a better error message in D5466 so it's clearer.

Apr 8 2021, 7:23 PM · Data Model, SWH command line interface
vlorentz added a revision to T3226: swh identify with type=snapshot shows dependency not installed error: D5466: swh-identify: Hide tracebacks if Click or Dulwich is not installed.
Apr 8 2021, 7:20 PM · Data Model, SWH command line interface
vlorentz triaged T3226: swh identify with type=snapshot shows dependency not installed error as Normal priority.
Apr 8 2021, 6:43 PM · Data Model, SWH command line interface
pawarhrishi21 created T3226: swh identify with type=snapshot shows dependency not installed error.
Apr 8 2021, 6:36 PM · Data Model, SWH command line interface
vlorentz closed T3220: Installing swh.model does not install its dependencies as Resolved.

Resolved by D5460; thanks again for the report

Apr 8 2021, 4:36 PM · Data Model, SWH command line interface
vlorentz added a project to T3220: Installing swh.model does not install its dependencies: Data Model.
Apr 8 2021, 4:18 PM · Data Model, SWH command line interface

Apr 7 2021

douardda added a comment to T3214: Restrict accepted timestamps to values that can be processed all along.

Looks like there is no revision with date or committer_date > 9999-12-31 in the main storage...
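
For reference, a small sketch of how such a bound can be checked in Python. It assumes that "can be processed" means "fits in Python's `datetime` range" (years 1 to 9999), which is an assumption made for illustration rather than the rule chosen in this task; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def fits_in_datetime(seconds: int, microseconds: int = 0) -> bool:
    """Tell whether a UTC timestamp can be represented as a datetime."""
    try:
        EPOCH + timedelta(seconds=seconds, microseconds=microseconds)
    except OverflowError:
        # Raised when the result falls outside years 1..9999.
        return False
    return True


# Last representable second (9999-12-31T23:59:59Z) and the one after it.
print(fits_in_datetime(253402300799))
print(fits_in_datetime(253402300800))
```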

Apr 7 2021, 3:04 PM · Data Model
douardda triaged T3214: Restrict accepted timestamps to values that can be processed all along as High priority.
Apr 7 2021, 2:30 PM · Data Model

Apr 6 2021

zack closed T1136: swh-identify: support recursive checksumming of directories as Invalid.

Duplicate of T3160

Apr 6 2021, 11:36 AM · Data Model

Mar 26 2021

DanSeraf closed T2570: swh-identify: support exclusion patterns (e.g., for .git/) as swh-scanner does as Resolved.

Already implemented in D4193

Mar 26 2021, 3:15 PM · Data Model

Mar 24 2021

seirl updated the task description for T3170: Revisions in the journal with out of range dates.
Mar 24 2021, 6:56 PM · Data Model, Journal
seirl updated the task description for T3170: Revisions in the journal with out of range dates.
Mar 24 2021, 4:11 PM · Data Model, Journal
seirl updated the task description for T3170: Revisions in the journal with out of range dates.
Mar 24 2021, 4:11 PM · Data Model, Journal
seirl updated the task description for T3170: Revisions in the journal with out of range dates.
Mar 24 2021, 4:10 PM · Data Model, Journal
seirl triaged T3170: Revisions in the journal with out of range dates as Normal priority.
Mar 24 2021, 1:13 PM · Data Model, Journal

Mar 23 2021

vlorentz added a comment to T2686: Use hashes for all kafka keys.

(and we should keep the origin topic; we already have an ExtSWHID for origins anyway)

Mar 23 2021, 2:55 PM · Data Model, Storage manager
olasd added a comment to T2686: Use hashes for all kafka keys.

The following objects remain:

Mar 23 2021, 2:47 PM · Data Model, Storage manager
vlorentz closed T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, a subtask of T2686: Use hashes for all kafka keys, as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager
vlorentz closed T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata, a subtask of T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager, Extrinsic metadata
olasd closed T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields, a subtask of T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, as Resolved.
Mar 23 2021, 2:25 PM · Data Model, Storage manager, Extrinsic metadata

Mar 20 2021

zack renamed T3160: swh identify: add a -R/--recursive flag from swh identify: add a -R/--recursive to swh identify: add a -R/--recursive flag.
Mar 20 2021, 2:22 PM · Easy hack, Data Model
zack updated the task description for T3160: swh identify: add a -R/--recursive flag.
Mar 20 2021, 2:21 PM · Easy hack, Data Model
zack triaged T3160: swh identify: add a -R/--recursive flag as Normal priority.
Mar 20 2021, 2:20 PM · Easy hack, Data Model

Mar 19 2021

vlorentz added a subtask for T2210: Data Model: T3134: SWHID v2.
Mar 19 2021, 4:23 PM · Data Model, Roadmap 2020
vlorentz added a parent task for T3134: SWHID v2: T2210: Data Model.
Mar 19 2021, 4:23 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
vlorentz triaged T3134: SWHID v2 as Normal priority.
Mar 19 2021, 4:22 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
vlorentz added a project to T3134: SWHID v2: Roadmap 2020.
Mar 19 2021, 4:21 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021
vlorentz merged task T2212: Specification for swh:2+: identifiers into T3134: SWHID v2.
Mar 19 2021, 4:21 PM · Data Model, Roadmap 2020