Nov 12 2020

ardumont added a revision to T2769: Make function swh.model.identifiers.parse_swhid more strict: D4462: identifiers.parse_swhid: Make SWHIDs with whitespaces invalid.
Nov 12 2020, 10:42 AM · Data Model
ardumont added a revision to T2769: Make function swh.model.identifiers.parse_swhid more strict: D4461: identifiers.parse_swhid: Check the swhid qualifiers and fail if invalid.
Nov 12 2020, 10:16 AM · Data Model

Nov 10 2020

ardumont added a revision to T2769: Make function swh.model.identifiers.parse_swhid more strict: D4458: model.identifiers: Improve error messages in case of invalid SWHIDs.
Nov 10 2020, 6:20 PM · Data Model
ardumont added a revision to T2769: Make function swh.model.identifiers.parse_swhid more strict: D4457: test: Migrate parse_swhid test cases to pytest.
Nov 10 2020, 4:13 PM · Data Model
ardumont triaged T2769: Make function swh.model.identifiers.parse_swhid more strict as Normal priority.
Nov 10 2020, 2:27 PM · Data Model

Nov 3 2020

moranegg moved T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier from Deployed to Archived on the SWORD deposit board.
Nov 3 2020, 10:17 AM · Web app, SWORD deposit, Data Model
moranegg closed T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier as Resolved.
Nov 3 2020, 10:16 AM · Web app, SWORD deposit, Data Model

Oct 26 2020

douardda closed T2421: Make model objects immutable as Resolved.

should be ok now (even if via ImmutableDict :-) )

Oct 26 2020, 2:51 PM · Data Model
douardda closed T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object as Resolved.
Oct 26 2020, 2:45 PM · Data Model
douardda closed T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object, a subtask of T2421: Make model objects immutable, as Resolved.
Oct 26 2020, 2:45 PM · Data Model
douardda updated the task description for T2423: Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object.
Oct 26 2020, 2:44 PM · Data Model

Oct 23 2020

olasd added a revision to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: D4348: Rename the RawExtrinsicMetadata id field to target.
Oct 23 2020, 5:20 PM · Data Model, Storage manager, Extrinsic metadata

Oct 19 2020

olasd added a revision to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: D4307: Update the HashableObject interface to take the object itself.
Oct 19 2020, 4:26 PM · Data Model, Storage manager, Extrinsic metadata
olasd added a revision to T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority: D4307: Update the HashableObject interface to take the object itself.
Oct 19 2020, 4:26 PM · Data Model, Storage manager, Extrinsic metadata
olasd added a revision to T2713: Strong typing of swh.model.identifiers arguments: D4307: Update the HashableObject interface to take the object itself.
Oct 19 2020, 4:26 PM · Data Model
olasd added a revision to T2715: Replace direct usage of swh.model.identifiers with use of swh.model.model objects: D4266: Use swh.model.model helpers to compute object identifiers.
Oct 19 2020, 11:41 AM · Data Model
olasd triaged T2715: Replace direct usage of swh.model.identifiers with use of swh.model.model objects as Low priority.
Oct 19 2020, 11:38 AM · Data Model
olasd triaged T2713: Strong typing of swh.model.identifiers arguments as Low priority.
Oct 19 2020, 11:37 AM · Data Model

Oct 16 2020

vlorentz added projects to T2666: GitHub releases not available in record: Data Model, Git loader.
Oct 16 2020, 2:28 PM · Git loader, Data Model

Oct 14 2020

ardumont moved T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier from Backlog to Deployed on the SWORD deposit board.
Oct 14 2020, 6:32 PM · Web app, SWORD deposit, Data Model
olasd added a comment to T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority.

This line of reasoning makes sense to me.

Oct 14 2020, 3:03 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz removed a parent task for T2686: Use hashes for all kafka keys: T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases.
Oct 14 2020, 2:08 PM · Data Model, Storage manager
vlorentz added a parent task for T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: T2686: Use hashes for all kafka keys.
Oct 14 2020, 2:08 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz added a parent task for T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority: T2686: Use hashes for all kafka keys.
Oct 14 2020, 2:08 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz added subtasks for T2686: Use hashes for all kafka keys: T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority.
Oct 14 2020, 2:08 PM · Data Model, Storage manager
vlorentz triaged T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority as High priority.
Oct 14 2020, 2:07 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz edited projects for T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, added: Data Model; removed Package Loader.
Oct 14 2020, 2:01 PM · Data Model, Storage manager, Extrinsic metadata

Oct 13 2020

ardumont updated the task description for T2686: Use hashes for all kafka keys.
Oct 13 2020, 8:51 AM · Data Model, Storage manager

Oct 12 2020

vlorentz updated the task description for T2686: Use hashes for all kafka keys.
Oct 12 2020, 1:07 PM · Data Model, Storage manager
vlorentz updated the task description for T2686: Use hashes for all kafka keys.
Oct 12 2020, 1:06 PM · Data Model, Storage manager
vlorentz added a parent task for T2686: Use hashes for all kafka keys: T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases.
Oct 12 2020, 1:06 PM · Data Model, Storage manager
vlorentz updated the task description for T2686: Use hashes for all kafka keys.
Oct 12 2020, 1:05 PM · Data Model, Storage manager
vlorentz added a parent task for T2686: Use hashes for all kafka keys: T2520: Setup dedicated kafka cluster on new rocquencourt hardware.
Oct 12 2020, 1:04 PM · Data Model, Storage manager
vlorentz triaged T2686: Use hashes for all kafka keys as Normal priority.
Oct 12 2020, 1:04 PM · Data Model, Storage manager

Sep 25 2020

anlambert added a revision to T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier: D4050: SWHIDs: Do not add path qualifier for a root directory.
Sep 25 2020, 2:43 PM · Web app, SWORD deposit, Data Model

Sep 24 2020

rdicosmo added a comment to T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier.

Yes please :-)

Sep 24 2020, 1:13 PM · Web app, SWORD deposit, Data Model
anlambert added a project to T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier: Web app.
Sep 24 2020, 12:52 PM · Web app, SWORD deposit, Data Model
anlambert added a comment to T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier.

Should this also be removed from the "Permalinks" tab in the webapp?

Sep 24 2020, 10:25 AM · Web app, SWORD deposit, Data Model

Sep 23 2020

moranegg triaged T2636: Modify generation of full-context SWHID for root artifacts by omitting path classifier as High priority.
Sep 23 2020, 11:16 PM · Web app, SWORD deposit, Data Model
seirl updated the task description for T2633: Tighten restrictions on directory entry names.
Sep 23 2020, 2:22 PM · Data Model
seirl triaged T2633: Tighten restrictions on directory entry names as Normal priority.
Sep 23 2020, 2:21 PM · Data Model

Sep 11 2020

ardumont updated the task description for T2586: Investigate feasibility of revision diff fragments specification for SWHID.
Sep 11 2020, 3:02 PM · Data Model
anlambert triaged T2586: Investigate feasibility of revision diff fragments specification for SWHID as Normal priority.
Sep 11 2020, 3:01 PM · Data Model

Sep 8 2020

zack added a comment to T2571: swh-identify: add support for --type revision.

Re supported VCSs: sure, but I'd start with git, which is low-hanging fruit.

Sep 8 2020, 1:18 PM · Easy hack, Data Model
douardda added a comment to T2571: swh-identify: add support for --type revision.

Do we mean something other than `echo swh:1:rev:$(git rev-parse HEAD)`?
Do we want support for other (supported) VCS?

Sep 8 2020, 12:34 PM · Easy hack, Data Model
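
For reference, the shell one-liner quoted in the comment above can be sketched in Python. This is only an illustration under assumptions: the helper names `revision_swhid` and `head_swhid` are hypothetical (they are not part of swh-identify), and the sketch assumes a git repository whose commit ids are 40-character hexadecimal SHA-1 values.

```python
import re
import subprocess

# A git (SHA-1) commit id is 40 hexadecimal characters.
_SHA1_HEX = re.compile(r"[0-9a-f]{40}")


def revision_swhid(commit_id: str) -> str:
    """Build a revision SWHID from a git commit id (hypothetical helper)."""
    commit_id = commit_id.strip().lower()
    if not _SHA1_HEX.fullmatch(commit_id):
        raise ValueError(f"not a SHA-1 commit id: {commit_id!r}")
    return f"swh:1:rev:{commit_id}"


def head_swhid(repo_dir: str = ".") -> str:
    """Rough equivalent of `echo swh:1:rev:$(git rev-parse HEAD)`."""
    out = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return revision_swhid(out)
```

Calling `head_swhid()` inside a git checkout would return the identifier for the current commit; other version control systems are not covered by this sketch.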
zack triaged T2571: swh-identify: add support for --type revision as Normal priority.
Sep 8 2020, 9:12 AM · Easy hack, Data Model
zack triaged T2570: swh-identify: support exclusion patterns (e.g., for .git/) as swh-scanner does as Normal priority.
Sep 8 2020, 9:09 AM · Data Model
zack closed T1687: Add filename as an optional part in persistent identifiers as Resolved.

This seems to have been addressed with the path qualifier in SWHIDs.
Closing.
(Please reopen if I'm missing something.)

Sep 8 2020, 9:06 AM · Data Model
zack updated subscribers of T1136: swh-identify: support recursive checksumming of directories.

As an update: a feature equivalent to this one has been implemented in swh-scanner by @DanSeraf.
I guess it would still be useful to also have it in swh-identify (as it seems like a natural need), but of course the code should not be duplicated.

Sep 8 2020, 9:04 AM · Data Model

Sep 4 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3883: algos.diff: Add missed revision_get conversion.
Sep 4 2020, 3:37 PM · Data Model, Storage manager

Sep 3 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3877: Adapt storage.revision_get calls according to latest api change.
Sep 3 2020, 5:48 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3870: Adapt storage.revision_get calls according to latest api change.
Sep 3 2020, 1:26 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3869: test_loader: Adapt to latest storage revision_get change.
Sep 3 2020, 1:20 PM · Data Model, Storage manager

Sep 2 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3868: loader: Adapt to latest storage revision_get change.
Sep 2 2020, 6:39 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3865: metadata: Adapt to latest storage revision_get change.
Sep 2 2020, 4:08 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3864: migrations: Adapt according to latest storage revision_get api change.
Sep 2 2020, 3:49 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3863: Refactor revision_get storage API to return Revision objects.
Sep 2 2020, 3:25 PM · Data Model, Storage manager

Aug 31 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3854: swh.web: Adapt to latest storage release_get api change.
Aug 31 2020, 4:44 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3853: test_loader: Adapt to latest storage release_get change.
Aug 31 2020, 3:49 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3852: storage*: release_get(...) -> List[Optional[Release]].
Aug 31 2020, 3:42 PM · Data Model, Storage manager

Aug 28 2020

vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Update: I just tried out my script to use Disarchive as an alternative to pristine-tar. It was pretty easy since they both work similarly (from the outside, ofc).

Aug 28 2020, 10:20 PM · Data Model

Aug 27 2020

vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@samplet so it's an improvement over pristine-tar, great!

Aug 27 2020, 11:33 PM · Data Model
samplet added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@lewo Great! The code is still “technical preview” quality, so there are some rough edges. Also, I’m willing to make architectural changes to better suit SWH. For instance, the whole “database” model might not be necessary if we want to store the metadata along with the files themselves. Feel free to ask for whatever help or changes you need! In the meantime, I will work on fixing some bugs and cleaning things up.

Aug 27 2020, 11:25 PM · Data Model
lewo added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@samplet wow! that's pretty cool! Thank you ;)

Aug 27 2020, 10:28 PM · Data Model
ardumont added a comment to T2478: backfill origin-visit and origin-visit-status topics.

I guess so, yes.

Aug 27 2020, 5:18 PM · Storage manager, Data Model
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.

Well, storage is typed now, but some endpoints remain inconsistent (as T645#47156 explains, with a unification proposal that is not done yet, aside from content_get_data and content_get_metadata).

Aug 27 2020, 5:11 PM · Data Model, Storage manager
douardda closed T2478: backfill origin-visit and origin-visit-status topics, a subtask of T2310: Make origin visits immutable, as Wontfix.
Aug 27 2020, 4:43 PM · Storage manager, Data Model
douardda closed T2478: backfill origin-visit and origin-visit-status topics as Wontfix.

I guess this task can be closed, since this backfilling process will be part of the one we will run soon to fill the new kafka cluster

Aug 27 2020, 4:43 PM · Storage manager, Data Model
douardda added a comment to T645: Type swh-storage endpoints with swh.model objects.

what's missing for this task to be closed?

Aug 27 2020, 4:32 PM · Data Model, Storage manager

Aug 26 2020

samplet added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@rdicosmo the full ID is a better choice. Thanks!

Aug 26 2020, 3:45 PM · Data Model
vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

I tried this as well, before trying pristine-tar. An issue I ran into is that storing the value of fields isn't enough, because there are multiple ways to represent them in tar (e.g. numbers are usually null-terminated strings, so \x00\x00\x00\x00, 0\x00\x00\x00, 0000, and \x00123 represent the same value).

Aug 26 2020, 10:06 AM · Data Model
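
The ambiguity described in the comment above can be illustrated with a small sketch. It assumes a lenient reader that cuts a numeric header field at the first NUL byte and treats an empty field as zero; the function name `parse_tar_number` is hypothetical, and real tar implementations may differ in the details.

```python
def parse_tar_number(field: bytes) -> int:
    """Decode an octal tar header field the way a lenient reader might.

    Assumption: the field is cut at the first NUL byte, surrounding
    spaces are ignored, and an empty result counts as zero.
    """
    text = field.split(b"\x00", 1)[0].strip(b" ").decode("ascii")
    return int(text, 8) if text else 0


# The four byte strings from the comment decode to a single value under
# this reader, so storing only the decoded number loses the original bytes.
encodings = [b"\x00\x00\x00\x00", b"0\x00\x00\x00", b"0000", b"\x00123"]
decoded = {parse_tar_number(raw) for raw in encodings}
```

Under these assumptions `decoded` contains a single element, which is why reproducing a tarball byte for byte needs the raw field bytes rather than just the parsed values.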
rdicosmo added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Thank you @samplet for sharing this great work: I am really looking forward to seeing you and @vlorentz compare notes and see whether we can archive the three extra files you produce as extrinsic metadata in the SWH archive!

Aug 26 2020, 9:54 AM · Data Model
samplet added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Hi all,

Aug 26 2020, 6:37 AM · Data Model

Aug 19 2020

vlorentz triaged T2523: Archive opensource.samsung.com as Normal priority.
Aug 19 2020, 7:40 PM · Lister, Archive coverage

Aug 7 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3738: model: Add Sha1 alias.
Aug 7 2020, 9:54 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3737: Adapt type and rename content_get_metadata calls to content_get.
Aug 7 2020, 6:40 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3735: test_npm: Adapt content_get_metadata call to content_get.
Aug 7 2020, 12:59 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3734: indexer.rehash: Adapt content_get_metadata call to content_get.
Aug 7 2020, 12:53 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3733: storage*: Rename and type content_get(List[Sha1]) -> List[Optional[Content]].
Aug 7 2020, 12:45 AM · Data Model, Storage manager

Aug 6 2020

vlorentz added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

I started looking into this, using https://nix-community.github.io/nixpkgs-swh/sources-unstable.json as a source of archive files.

Aug 6 2020, 4:19 PM · Data Model
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3719: Adapt code according to storage signature.
Aug 6 2020, 9:58 AM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3718: "text"-indexers: Migrate to partition index instead of range.
Aug 6 2020, 9:50 AM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3717: Adapt code according to storage signature.
Aug 6 2020, 9:46 AM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3716: Adapt code according to storage signature.
Aug 6 2020, 9:40 AM · Data Model, Storage manager

Aug 5 2020

ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3715: in_memory: Drop dead code.
Aug 5 2020, 8:02 PM · Data Model, Storage manager
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.
  1. {object}_missing for object in {content, directory, revision, release, snapshot}
Aug 5 2020, 5:45 PM · Data Model, Storage manager
ardumont added a comment to T645: Type swh-storage endpoints with swh.model objects.

I'll type with what we have right now; that will simplify the next diffs, which introduce type changes.
It also demonstrates the inconsistencies we have right now.

Aug 5 2020, 5:44 PM · Data Model, Storage manager
ardumont closed T2517: Add remaining missing types to swh.storage.interface, a subtask of T645: Type swh-storage endpoints with swh.model objects, as Resolved.
Aug 5 2020, 5:29 PM · Data Model, Storage manager
ardumont closed T2517: Add remaining missing types to swh.storage.interface as Resolved.
Aug 5 2020, 5:29 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3713: storage*: content_get_partition(...) -> PagedResult[Content].
Aug 5 2020, 4:11 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3712: storage*: Drop deprecated content_get_range endpoint.
Aug 5 2020, 3:18 PM · Data Model, Storage manager
ardumont renamed T645: Type swh-storage endpoints with swh.model objects from Type swh-storage endpoints to Type swh-storage endpoints with swh.model objects.
Aug 5 2020, 2:06 PM · Data Model, Storage manager
ardumont removed revisions from T645: Type swh-storage endpoints with swh.model objects: D3708: storage*: origin_get_by_sha1: Drop generator from pgstorage, D3707: storage*: revision_*: Type remaining existing endpoints, D3706: storage*: directory_*: Type remaining existing endpoints, D3705: storage*: skipped_content_missing: Type remaining existing endpoints, D3704: storage*: content_missing_per_sha1(_git): Type remaining existing endpoints, D3703: storage*: content_missing: Unify and type remaining existing endpoints, D3702: storage*: content_get_partition: Type remaining existing endpoints, D3701: storage*: content_get_range: Type remaining existing endpoints, D3700: storage*: content_get: Type remaining existing endpoints, D3699: storage*: content_update: Type remaining existing endpoints, D3698: storage*: origin_get_by_sha1: Type remaining existing endpoints, D3697: storage*: check_config: Type remaining existing endpoints.
Aug 5 2020, 1:12 PM · Data Model, Storage manager
ardumont added revisions to T2517: Add remaining missing types to swh.storage.interface: D3708: storage*: origin_get_by_sha1: Drop generator from pgstorage, D3707: storage*: revision_*: Type remaining existing endpoints, D3706: storage*: directory_*: Type remaining existing endpoints, D3705: storage*: skipped_content_missing: Type remaining existing endpoints, D3704: storage*: content_missing_per_sha1(_git): Type remaining existing endpoints, D3703: storage*: content_missing: Unify and type remaining existing endpoints, D3702: storage*: content_get_partition: Type remaining existing endpoints, D3701: storage*: content_get_range: Type remaining existing endpoints, D3700: storage*: content_get: Type remaining existing endpoints, D3699: storage*: content_update: Type remaining existing endpoints, D3698: storage*: origin_get_by_sha1: Type remaining existing endpoints, D3697: storage*: check_config: Type remaining existing endpoints.
Aug 5 2020, 1:12 PM · Data Model, Storage manager
ardumont renamed T2517: Add remaining missing types to swh.storage.interface from Add remaining missing types to the interface to Add remaining missing types to swh.storage.interface.
Aug 5 2020, 1:10 PM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3711: storage*: object_find_by_sha1_git: Type remaining existing endpoints.
Aug 5 2020, 1:06 PM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3710: storage*: snapshot_*: Type remaining existing endpoints.
Aug 5 2020, 12:50 PM · Data Model, Storage manager
ardumont added a revision to T2517: Add remaining missing types to swh.storage.interface: D3709: storage*: release_*: Type remaining existing endpoints.
Aug 5 2020, 12:30 PM · Data Model, Storage manager
ardumont triaged T2517: Add remaining missing types to swh.storage.interface as Normal priority.
Aug 5 2020, 12:21 PM · Data Model, Storage manager
ardumont added a revision to T645: Type swh-storage endpoints with swh.model objects: D3708: storage*: origin_get_by_sha1: Drop generator from pgstorage.
Aug 5 2020, 9:55 AM · Data Model, Storage manager