Sep 1 2021

vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 1 2021, 4:05 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 1 2021, 4:03 PM · Origin-GitHub, Extrinsic metadata
vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

no and yes, respectively

Sep 1 2021, 4:02 PM · Origin-GitHub, Extrinsic metadata
douardda added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

do we need the "list of forks" if we keep the "fork of what"? I mean these are the 2 ends of the fork relation, right?

Sep 1 2021, 12:06 PM · Origin-GitHub, Extrinsic metadata
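
The question above (whether "list of forks" is redundant once "fork of what" is kept) can be illustrated with a minimal sketch. It assumes each collected repository records only its parent, and rebuilds the list of forks by inverting that mapping. The repository names and the `fork_of` dictionary are hypothetical and are not taken from the GitHub API.

```python
from collections import defaultdict

# Hypothetical data: repository -> the repository it was forked from.
fork_of = {
    "alice/project": "upstream/project",
    "bob/project": "upstream/project",
    "carol/project": "alice/project",
}


def invert_fork_relation(fork_of):
    """Rebuild each repository's "list of forks" from "fork of what" records."""
    forks = defaultdict(list)
    for child, parent in fork_of.items():
        forks[parent].append(child)
    # Sort the children so the result is deterministic.
    return {parent: sorted(children) for parent, children in forks.items()}


forks = invert_fork_relation(fork_of)
```

The inversion only finds forks that were themselves collected, so the two ends carry the same information only when both sides of the relation are in the dataset.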

Aug 31 2021

vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

"topics" (these are the "tags", right?)

Aug 31 2021, 4:01 PM · Origin-GitHub, Extrinsic metadata
zack added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

Here's an opinionated and prioritized list.

Aug 31 2021, 3:49 PM · Origin-GitHub, Extrinsic metadata
vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

At the moment, I think that all the properties you have selected in the task are needed.

Aug 31 2021, 12:39 PM · Origin-GitHub, Extrinsic metadata
moranegg added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

At the moment, I think that all the properties you have selected in the task are needed.
+1 for License (it is something they show on the interface even if it is based on a heuristic).

Aug 31 2021, 12:08 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Aug 31 2021, 11:58 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Aug 31 2021, 11:55 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Aug 31 2021, 11:53 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Aug 31 2021, 11:52 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Aug 31 2021, 11:15 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Aug 31 2021, 11:15 AM · Origin-GitHub, Extrinsic metadata
vlorentz triaged T3542: Decide what metadata we want to / can collect from GitHub as Normal priority.
Aug 31 2021, 11:11 AM · Origin-GitHub, Extrinsic metadata

Aug 19 2021

zack updated the task description for T3490: Collect metadata from ClearlyDefined.
Aug 19 2021, 10:13 AM · Extrinsic metadata
vlorentz removed a subtask for T2202: Collect extrinsic metadata: T2513: Copy metadata on revisions to the extrinsic metadata storage.
Aug 19 2021, 9:20 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz updated the task description for T3490: Collect metadata from ClearlyDefined.
Aug 19 2021, 9:18 AM · Extrinsic metadata
vlorentz placed T3490: Collect metadata from ClearlyDefined up for grabs.
Aug 19 2021, 9:18 AM · Extrinsic metadata
vlorentz triaged T3490: Collect metadata from ClearlyDefined as Normal priority.
Aug 19 2021, 9:16 AM · Extrinsic metadata

Aug 11 2021

vlorentz closed T3478: Add examples to api/1/raw-extrinsic-metadata/swhid/authorities/doc/ as Resolved.

Done: 93ce62f0776432175c886f40e5d60c36203ed45f

Aug 11 2021, 9:30 AM · Documentation, Extrinsic metadata

Aug 10 2021

moranegg triaged T3478: Add examples to api/1/raw-extrinsic-metadata/swhid/authorities/doc/ as Normal priority.
Aug 10 2021, 6:10 PM · Documentation, Extrinsic metadata

Jul 15 2021

vlorentz closed T2938: Create API endpoint to access raw_extrinsic_metadata, a subtask of T3097: Expose metadata in the WebApp and make it searchable, as Resolved.
Jul 15 2021, 12:18 PM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task

Jun 29 2021

vlorentz renamed T2202: Collect extrinsic metadata from Extrinsic metadata to Collect extrinsic metadata.
Jun 29 2021, 10:53 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz changed the status of T3097: Expose metadata in the WebApp and make it searchable from Open to Work in Progress.
Jun 29 2021, 10:53 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task

Apr 26 2021

vlorentz raised the priority of T1747: Review APIs to get metadata from supported origins from Low to Normal.
Apr 26 2021, 10:51 AM · Extrinsic metadata

Apr 19 2021

vlorentz lowered the priority of T3273: Use "fork" relationships to speed-up initial load of large repositories from Normal to Low.
Apr 19 2021, 1:50 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz triaged T3273: Use "fork" relationships to speed-up initial load of large repositories as Normal priority.
Apr 19 2021, 1:49 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

Apr 15 2021

vlorentz placed T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage up for grabs.
Apr 15 2021, 3:17 PM · Storage manager, Extrinsic metadata

Apr 2 2021

vlorentz claimed T2202: Collect extrinsic metadata.
Apr 2 2021, 10:12 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz claimed T3097: Expose metadata in the WebApp and make it searchable.
Apr 2 2021, 10:11 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task

Mar 23 2021

vlorentz closed T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, a subtask of T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases, as Resolved.
Mar 23 2021, 2:33 PM · Package Loader, Storage manager, Extrinsic metadata
vlorentz closed T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata, a subtask of T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata as Resolved.
Mar 23 2021, 2:33 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra, a subtask of T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields, as Resolved.
Mar 23 2021, 2:32 PM · Storage manager, Extrinsic metadata
vlorentz closed T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra, a subtask of T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage, as Resolved.
Mar 23 2021, 2:32 PM · Storage manager, Extrinsic metadata
vlorentz closed T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra as Resolved.
Mar 23 2021, 2:32 PM · Storage manager, Extrinsic metadata
olasd closed T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql, a subtask of T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage, as Resolved.
Mar 23 2021, 2:31 PM · Storage manager, Extrinsic metadata
olasd closed T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql as Resolved.

After a lot of back and forth, and the release of swh.model v2.3.0 and swh.storage v0.26.0, this is now all done and deployed in staging and production.

Mar 23 2021, 2:31 PM · Storage manager, Extrinsic metadata
olasd closed T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql, a subtask of T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields, as Resolved.
Mar 23 2021, 2:31 PM · Storage manager, Extrinsic metadata
olasd closed T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields, a subtask of T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, as Resolved.
Mar 23 2021, 2:25 PM · Data Model, Storage manager, Extrinsic metadata
olasd closed T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields as Resolved.

After the release of swh.model v2, this is now done.

Mar 23 2021, 2:25 PM · Storage manager, Extrinsic metadata
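
As a rough illustration of what deduplicating by hash rather than by a subset of fields means, here is a minimal sketch. It assumes records are plain dictionaries and derives the identifier from a JSON serialization of every field; this is not the manifest format used by swh.model, and the field values and the `record_id` and `deduplicate` helpers are hypothetical.

```python
import hashlib
import json


def record_id(record):
    """Hash a canonical serialization of ALL fields.

    Illustrative only: this is not the swh.model manifest format.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first record seen for each distinct hash."""
    seen = {}
    for record in records:
        seen.setdefault(record_id(record), record)
    return list(seen.values())


# Hypothetical records: same target, authority, discovery date and fetcher,
# but different metadata payloads.
base = {
    "target": "swh:1:dir:" + "0" * 40,
    "authority": "example-forge",
    "discovery_date": "2021-03-23T14:25:00+00:00",
    "fetcher": "example-fetcher 1.0",
}
first = dict(base, metadata='{"license": "MIT"}')
second = dict(base, metadata='{"license": "GPL-3.0"}')

unique = deduplicate([first, dict(first), second])
```

Keyed on the four shared fields alone, `first` and `second` would collapse into one record; keyed on the hash of all fields, both are kept and only the exact copy is dropped.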

Mar 15 2021

rdicosmo updated the task description for T2202: Collect extrinsic metadata.
Mar 15 2021, 9:08 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata

Mar 8 2021

vlorentz triaged T3097: Expose metadata in the WebApp and make it searchable as Normal priority.
Mar 8 2021, 11:41 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
rdicosmo updated the task description for T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:44 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
rdicosmo added subtasks for T3097: Expose metadata in the WebApp and make it searchable: T2073: Index extrinsic metadata from the journal in swh-search/Elasticsearch, T2938: Create API endpoint to access raw_extrinsic_metadata, T2088: Specify and draw metadata view on web-app, T2191: Metadata Views.
Mar 8 2021, 10:33 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
rdicosmo created T3097: Expose metadata in the WebApp and make it searchable.
Mar 8 2021, 10:31 AM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task

Mar 5 2021

vlorentz added a revision to T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra: D5030: raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique.
Mar 5 2021, 3:52 PM · Storage manager, Extrinsic metadata
vlorentz added a revision to T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql: D5029: Add raw_extrinsic_metadata.id column in postgresql..
Mar 5 2021, 3:52 PM · Storage manager, Extrinsic metadata
vlorentz closed T3074: Migrate all packages away from the old SWHID class, a subtask of T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage, as Resolved.
Mar 5 2021, 12:31 PM · Storage manager, Extrinsic metadata
vlorentz closed T3074: Migrate all packages away from the old SWHID class, a subtask of T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata, as Resolved.
Mar 5 2021, 12:31 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3074: Migrate all packages away from the old SWHID class, a subtask of T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql, as Resolved.
Mar 5 2021, 12:31 PM · Storage manager, Extrinsic metadata
vlorentz closed T3074: Migrate all packages away from the old SWHID class, a subtask of T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra, as Resolved.
Mar 5 2021, 12:31 PM · Storage manager, Extrinsic metadata

Feb 26 2021

vlorentz added a subtask for T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata: T3074: Migrate all packages away from the old SWHID class.
Feb 26 2021, 6:58 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz added a subtask for T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage: T3074: Migrate all packages away from the old SWHID class.
Feb 26 2021, 6:58 PM · Storage manager, Extrinsic metadata
vlorentz added a subtask for T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql: T3074: Migrate all packages away from the old SWHID class.
Feb 26 2021, 6:58 PM · Storage manager, Extrinsic metadata
vlorentz added a subtask for T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra: T3074: Migrate all packages away from the old SWHID class.
Feb 26 2021, 6:57 PM · Storage manager, Extrinsic metadata

Feb 12 2021

moranegg edited projects for T2693: fetch extrinsic origin metadata from GitLab instances, added: Extrinsic metadata; removed Metadata workflow.
Feb 12 2021, 4:35 PM · Extrinsic metadata, Origin-GitLab
moranegg edited projects for T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders, added: Extrinsic metadata; removed Metadata workflow.
Feb 12 2021, 4:34 PM · Extrinsic metadata
moranegg edited projects for T1747: Review APIs to get metadata from supported origins, added: Extrinsic metadata; removed Metadata workflow.
Feb 12 2021, 4:34 PM · Extrinsic metadata

Feb 3 2021

vlorentz added a revision to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: D4970: model: Add 'id' field to RawExtrinsicMetadata.
Feb 3 2021, 9:13 AM · Data Model, Storage manager, Extrinsic metadata

Feb 2 2021

vlorentz lowered the priority of T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage from High to Low.

Actually, this is probably not needed. (I may close this task later)

Feb 2 2021, 8:50 PM · Storage manager, Extrinsic metadata
vlorentz added a parent task for T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql: T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields.
Feb 2 2021, 2:22 PM · Storage manager, Extrinsic metadata
vlorentz added a parent task for T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra: T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields.
Feb 2 2021, 2:22 PM · Storage manager, Extrinsic metadata
vlorentz added subtasks for T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields: T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql, T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra.
Feb 2 2021, 2:22 PM · Storage manager, Extrinsic metadata
vlorentz renamed T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra from Allow querying raw_extrinsic_metadata by hash in swh.storage.cassandra to Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra.
Feb 2 2021, 2:21 PM · Storage manager, Extrinsic metadata
vlorentz renamed T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql from Allow querying raw_extrinsic_metadata by hash in swh.storage.postgresql to Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql.
Feb 2 2021, 2:21 PM · Storage manager, Extrinsic metadata
vlorentz triaged T3022: Deduplicate RawExtrinsicMetadata by hash instead of a subset of their fields as High priority.
Feb 2 2021, 2:15 PM · Storage manager, Extrinsic metadata
vlorentz updated subscribers of T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra.
Feb 2 2021, 1:40 PM · Storage manager, Extrinsic metadata
vlorentz updated subscribers of T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql.
Feb 2 2021, 1:40 PM · Storage manager, Extrinsic metadata
vlorentz updated subscribers of T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage.
Feb 2 2021, 1:40 PM · Storage manager, Extrinsic metadata
vlorentz updated subscribers of T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata.
Feb 2 2021, 1:40 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz triaged T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra as High priority.
Feb 2 2021, 1:40 PM · Storage manager, Extrinsic metadata
vlorentz triaged T3019: Add an index for raw_extrinsic_metadata.id in swh.storage.postgresql as High priority.
Feb 2 2021, 1:37 PM · Storage manager, Extrinsic metadata
vlorentz triaged T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage as High priority.
Feb 2 2021, 1:35 PM · Storage manager, Extrinsic metadata
vlorentz triaged T3017: Use hashes as keys in swh.journal.objects.raw_extrinsic_metadata as High priority.
Feb 2 2021, 1:34 PM · Data Model, Storage manager, Extrinsic metadata

Jan 25 2021

vlorentz added a revision to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: D4935: identifiers: Add raw_extrinsic_metadata_identifier.
Jan 25 2021, 12:32 PM · Data Model, Storage manager, Extrinsic metadata

Jan 11 2021

vlorentz claimed T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects.
Jan 11 2021, 4:27 PM · Data Model, Storage manager, Extrinsic metadata

Jan 7 2021

olasd added a comment to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects.

Sounds good to me.

Jan 7 2021, 8:18 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz added a parent task for T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: T2779: Put information (client, collection and deposit-id) inside metadata for metadata-only deposit.
Jan 7 2021, 1:53 PM · Data Model, Storage manager, Extrinsic metadata

Jan 5 2021

vlorentz added a comment to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects.

Proposed manifest format:

Jan 5 2021, 3:21 PM · Data Model, Storage manager, Extrinsic metadata

Dec 7 2020

moranegg closed T2311: Review the deposit of CodeMeta metadata in xml (following SWORD V2 specs), a subtask of T2202: Collect extrinsic metadata, as Resolved.
Dec 7 2020, 4:03 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata

Nov 23 2020

vlorentz added a parent task for T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: T2513: Copy metadata on revisions to the extrinsic metadata storage.
Nov 23 2020, 12:00 PM · Data Model, Storage manager, Extrinsic metadata

Nov 2 2020

vlorentz closed T2667: Decide what to do with PyPI snapshot metadata as Resolved.
Nov 2 2020, 1:51 PM · Extrinsic metadata, PyPI loader
vlorentz closed T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases as Resolved.
Nov 2 2020, 12:23 PM · Package Loader, Storage manager, Extrinsic metadata

Oct 23 2020

vlorentz added a revision to T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases: D4349: migrate_extrinsic_metadata: Write metadata on directories instead of revisions..
Oct 23 2020, 5:26 PM · Package Loader, Storage manager, Extrinsic metadata
olasd added a revision to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: D4348: Rename the RawExtrinsicMetadata id field to target.
Oct 23 2020, 5:20 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz added revisions to T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases: D4346: package loaders: write extrinsic metadata to directories instead of revisions., D4347: package loaders: write original_artifact metadata to directories instead of revisions..
Oct 23 2020, 5:01 PM · Package Loader, Storage manager, Extrinsic metadata

Oct 19 2020

olasd added a revision to T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: D4307: Update the HashableObject interface to take the object itself.
Oct 19 2020, 4:26 PM · Data Model, Storage manager, Extrinsic metadata
olasd added a revision to T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority: D4307: Update the HashableObject interface to take the object itself.
Oct 19 2020, 4:26 PM · Data Model, Storage manager, Extrinsic metadata

Oct 14 2020

olasd added a comment to T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority.

This line of reasoning makes sense to me.

Oct 14 2020, 3:03 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz removed a subtask for T2668: Package loaders should write extrinsic metadata on directories instead of revisions/releases: T2686: Use hashes for all kafka keys.
Oct 14 2020, 2:08 PM · Package Loader, Storage manager, Extrinsic metadata
vlorentz added a parent task for T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects: T2686: Use hashes for all kafka keys.
Oct 14 2020, 2:08 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz added a parent task for T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority: T2686: Use hashes for all kafka keys.
Oct 14 2020, 2:08 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz triaged T2704: Use a hash as id/ unicity key for MetadataFetcher and MetadataAuthority as High priority.
Oct 14 2020, 2:07 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz edited projects for T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, added: Data Model; removed Package Loader.
Oct 14 2020, 2:01 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz renamed T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects from Use intrinsic identifiers for RawExtrinsicMetadata objects to Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects.
Oct 14 2020, 2:01 PM · Data Model, Storage manager, Extrinsic metadata
vlorentz triaged T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects as High priority.
Oct 14 2020, 2:01 PM · Data Model, Storage manager, Extrinsic metadata

Oct 13 2020

vlorentz added a revision to T2667: Decide what to do with PyPI snapshot metadata: D4242: pypi: write metadata on revisions instead of snapshots..
Oct 13 2020, 11:04 AM · Extrinsic metadata, PyPI loader