
Aug 24 2022

vlorentz triaged T4457: Index metadata from Gitea/Gogs as Normal priority.
Aug 24 2022, 12:37 PM · Origin-Gitea/Gogs, Extrinsic metadata, Indexer
vlorentz added a revision to T4451: Fetch extrinsic metadata from Gitea/Gogs: D8303: Add Gitea/Gogs metadata fetcher.
Aug 24 2022, 12:34 PM · Origin-Gitea/Gogs, Extrinsic metadata

Aug 22 2022

vlorentz triaged T4451: Fetch extrinsic metadata from Gitea/Gogs as Normal priority.
Aug 22 2022, 10:44 AM · Origin-Gitea/Gogs, Extrinsic metadata

Jul 19 2022

vlorentz added a subtask for T3097: Expose metadata in the WebApp and make it searchable: T2064: Add metadata from deposits to metadata search.
Jul 19 2022, 1:05 PM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task
vlorentz added a subtask for T3097: Expose metadata in the WebApp and make it searchable: Restricted Maniphest Task.
Jul 19 2022, 1:02 PM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task

Jul 18 2022

vlorentz triaged T4394: Add support for running metadata fetchers without a VCS/package loaders as Normal priority.
Jul 18 2022, 10:33 AM · Extrinsic metadata
vlorentz closed T4377: Create API endpoint to expose raw extrinsic metadata on origins, a subtask of T2202: Collect extrinsic metadata, as Resolved.
Jul 18 2022, 10:31 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz closed T4377: Create API endpoint to expose raw extrinsic metadata on origins as Resolved.
Jul 18 2022, 10:31 AM · Web app, Extrinsic metadata

Jul 13 2022

vlorentz added revisions to T4377: Create API endpoint to expose raw extrinsic metadata on origins: D8119: Add API endpoint to get the list of metadata authorities on an origin from its URL, D8120: Add link to origin extrinsic metadata API endpoint from the UI.
Jul 13 2022, 10:29 AM · Web app, Extrinsic metadata

Jul 12 2022

vlorentz added a revision to T4377: Create API endpoint to expose raw extrinsic metadata on origins: D8114: Add support for querying raw-extrinsic-metadata for origins.
Jul 12 2022, 2:12 PM · Web app, Extrinsic metadata

Jul 6 2022

vlorentz added a comment to T2693: fetch extrinsic origin metadata from GitLab instances.

GitLab returns very little data while logged out, so we won't be able to collect much. This seems to differ from their documentation, so I opened a ticket https://gitlab.com/gitlab-org/gitlab/-/issues/361952

Jul 6 2022, 12:41 PM · Extrinsic metadata, Origin-GitLab

Jul 5 2022

vlorentz closed T833: When listing an origin, add origin level metadata to RMD storage as Wontfix.

Replaced by loader-based metadata loading (T4188 / T4186).

Jul 5 2022, 6:33 PM · Extrinsic metadata, Restricted Project, GitHub lister
vlorentz closed T833: When listing an origin, add origin level metadata to RMD storage, a subtask of T2202: Collect extrinsic metadata, as Wontfix.
Jul 5 2022, 6:33 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz closed T1747: Review APIs to get metadata from supported origins, a subtask of T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders, as Resolved.
Jul 5 2022, 5:28 PM · Extrinsic metadata
vlorentz closed T1747: Review APIs to get metadata from supported origins as Resolved.
Jul 5 2022, 5:28 PM · Extrinsic metadata
vlorentz renamed T4377: Create API endpoint to expose raw extrinsic metadata on origins from Create API endpoint to expose raw extrinsic metadata from forges to Create API endpoint to expose raw extrinsic metadata on origins.
Jul 5 2022, 5:00 PM · Web app, Extrinsic metadata
vlorentz edited projects for T4377: Create API endpoint to expose raw extrinsic metadata on origins, added: Web app; removed Roadmap 2022.
Jul 5 2022, 12:17 PM · Web app, Extrinsic metadata
moranegg triaged T4377: Create API endpoint to expose raw extrinsic metadata on origins as High priority.
Jul 5 2022, 12:13 PM · Web app, Extrinsic metadata

Jun 2 2022

vlorentz moved T2202: Collect extrinsic metadata from Backlog to Work in progress on the Roadmap 2022 board.
Jun 2 2022, 9:57 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz removed a project from T3490: Collect metadata from ClearlyDefined: meta-task.
Jun 2 2022, 9:56 AM · Extrinsic metadata

May 30 2022

vlorentz added a parent task for T3273: Use "fork" relationships to speed-up initial load of large repositories: T4283: Load https://github.com/chromium/chromium with a higher packfile size limit.
May 30 2022, 3:41 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

May 17 2022

vlorentz lowered the priority of T4252: Schedule recurring fetches of origin metadata from High to Normal.
May 17 2022, 3:11 PM · Extrinsic metadata
vlorentz updated the task description for T4252: Schedule recurring fetches of origin metadata.
May 17 2022, 3:06 PM · Extrinsic metadata
vlorentz triaged T4252: Schedule recurring fetches of origin metadata as High priority.
May 17 2022, 3:06 PM · Extrinsic metadata

May 13 2022

vlorentz added a revision to T3273: Use "fork" relationships to speed-up initial load of large repositories: D7831: Use all base snapshots in determine_wants().
May 13 2022, 3:23 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

May 10 2022

vlorentz added a comment to T3273: Use "fork" relationships to speed-up initial load of large repositories.

Currently can't do it on GitLab while logged out: https://gitlab.com/gitlab-org/gitlab/-/issues/361952

May 10 2022, 4:13 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz placed T3859: investigate using metadata from GHTorrent up for grabs.
May 10 2022, 4:12 PM · Extrinsic metadata

May 3 2022

vlorentz removed a parent task for T2202: Collect extrinsic metadata: T3273: Use "fork" relationships to speed-up initial load of large repositories.
May 3 2022, 11:16 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz removed a subtask for T3273: Use "fork" relationships to speed-up initial load of large repositories: T2202: Collect extrinsic metadata.
May 3 2022, 11:16 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a parent task for T2202: Collect extrinsic metadata: T3273: Use "fork" relationships to speed-up initial load of large repositories.
May 3 2022, 11:16 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz added subtasks for T3273: Use "fork" relationships to speed-up initial load of large repositories: T1740: fetch extrinsic origin metadata from GitHub, T2202: Collect extrinsic metadata.
May 3 2022, 11:16 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a subtask for T3273: Use "fork" relationships to speed-up initial load of large repositories: T4219: Investigate why GitHub fork detection did not bring a speed-up.
May 3 2022, 11:15 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz placed T3558: Enable the swh-search QL in production up for grabs.
May 3 2022, 11:08 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz closed T1740: fetch extrinsic origin metadata from GitHub, a subtask of T833: When listing an origin, add origin level metadata to RMD storage, as Resolved.
May 3 2022, 11:08 AM · Extrinsic metadata, Restricted Project, GitHub lister
vlorentz closed T1740: fetch extrinsic origin metadata from GitHub, a subtask of T2202: Collect extrinsic metadata, as Resolved.
May 3 2022, 11:08 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata

Apr 29 2022

ardumont closed T4206: prod: Deploy metadata loader v0.0.2, a subtask of T3273: Use "fork" relationships to speed-up initial load of large repositories, as Resolved.
Apr 29 2022, 11:27 AM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

Apr 28 2022

ardumont changed the status of T4206: prod: Deploy metadata loader v0.0.2, a subtask of T3273: Use "fork" relationships to speed-up initial load of large repositories, from Open to Work in Progress.
Apr 28 2022, 3:43 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz edited projects for T3273: Use "fork" relationships to speed-up initial load of large repositories, added: Origin-GitHub; removed GitHub lister.
Apr 28 2022, 3:27 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz edited projects for T3273: Use "fork" relationships to speed-up initial load of large repositories, added: Origin-GitLab; removed GitLab migration.
Apr 28 2022, 3:27 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added projects to T3273: Use "fork" relationships to speed-up initial load of large repositories: GitHub lister, GitLab migration.
Apr 28 2022, 3:27 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a project to T3273: Use "fork" relationships to speed-up initial load of large repositories: Git loader.
Apr 28 2022, 3:26 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added a subtask for T3273: Use "fork" relationships to speed-up initial load of large repositories: T4206: prod: Deploy metadata loader v0.0.2.
Apr 28 2022, 3:26 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

Apr 27 2022

vlorentz placed T3559: Enable the swh-search QL in staging up for grabs.
Apr 27 2022, 2:28 PM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz claimed T3273: Use "fork" relationships to speed-up initial load of large repositories.
Apr 27 2022, 2:12 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader
vlorentz added revisions to T3273: Use "fork" relationships to speed-up initial load of large repositories: D7691: Store the result of MetadataFetcher.get_parent_origins, D7695: Replace 'base_url' argument with 'self.parent_origins' attribute, D7663: Add method get_parent_origins().
Apr 27 2022, 2:07 PM · Origin-GitHub, Origin-GitLab, Git loader, Extrinsic metadata, Core Loader

Apr 22 2022

pratyush added a comment to T1747: Review APIs to get metadata from supported origins.

WIP

Apr 22 2022, 4:44 PM · Extrinsic metadata

Apr 21 2022

vlorentz removed a parent task for T3859: investigate using metadata from GHTorrent: T1740: fetch extrinsic origin metadata from GitHub.
Apr 21 2022, 8:39 PM · Extrinsic metadata
vlorentz added a comment to T3859: investigate using metadata from GHTorrent.

Removing this from the subtasks of T1740, given that we decided in T1739 to query the API at the same time as we load origins.

Apr 21 2022, 8:39 PM · Extrinsic metadata
vlorentz added a parent task for T2693: fetch extrinsic origin metadata from GitLab instances: T2202: Collect extrinsic metadata.
Apr 21 2022, 9:01 AM · Extrinsic metadata, Origin-GitLab
vlorentz added a subtask for T2202: Collect extrinsic metadata: T2693: fetch extrinsic origin metadata from GitLab instances.
Apr 21 2022, 9:01 AM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
vlorentz closed T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders, a subtask of T2693: fetch extrinsic origin metadata from GitLab instances, as Resolved.
Apr 21 2022, 9:00 AM · Extrinsic metadata, Origin-GitLab
vlorentz closed T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders as Resolved.

I started working on this design. We'll see if it needs to change later.

Apr 21 2022, 9:00 AM · Extrinsic metadata

Apr 19 2022

vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Apr 19 2022, 12:10 PM · Origin-GitHub, Extrinsic metadata
vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

In summary, we would archive everything with priority "high" or "mid", as well as the "license" and "main language" fields, as they are all easy to fetch and store

Apr 19 2022, 11:18 AM · Origin-GitHub, Extrinsic metadata

Apr 18 2022

pratyush added a comment to T1747: Review APIs to get metadata from supported origins.
id | name [url] | type | methods | auth/throttle | code_source | metadata_source | metadata_conformance | etl_code | status
Apr 18 2022, 8:46 PM · Extrinsic metadata
pratyush added a comment to T1747: Review APIs to get metadata from supported origins.
Apr 18 2022, 7:55 PM · Extrinsic metadata

Apr 11 2022

vlorentz added a comment to T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders.
In T1739#82939, @olasd wrote:

Yes, all these are good points. As long as forges don't provide a way of loading the metadata in bulk, it makes sense to do it at the same time as loading.

Apr 11 2022, 2:44 PM · Extrinsic metadata
olasd added a comment to T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders.

The original idea for this was to have separate tasks to fetch metadata, so that loaders did not have forge-specific code to fetch metadata.

However, the idea of loading metadata from the loader is more appealing the more I think about it:

  1. Metadata are fetched at about the same time as we snapshot code, which would allow showing more consistent states of repositories
  2. Active repositories automatically have their metadata fetched more often than inactive ones
  3. We don't have one more moving part to monitor and schedule
  4. This allows the Git loader to know a new repo is a "forge fork" of another one before it starts loading, so it can do an incremental load
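
A minimal sketch of point 4, assuming a loader that asks a metadata fetcher for fork parents before it starts loading. Every name here (`FakeForge`, `SketchMetadataFetcher`, `SketchLoader`, the in-memory `archive` dict) is hypothetical and only illustrates the idea; it is not the actual Software Heritage loader code:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class FakeForge:
    """Stand-in for a forge API: maps a repository URL to its fork parent."""
    parents: Dict[str, str]

    def get_parent(self, url: str) -> Optional[str]:
        return self.parents.get(url)


@dataclass
class SketchMetadataFetcher:
    """Hypothetical fetcher, called by the loader right before loading."""
    forge: FakeForge

    def get_parent_origins(self, origin_url: str) -> List[str]:
        parent = self.forge.get_parent(origin_url)
        return [parent] if parent is not None else []


@dataclass
class SketchLoader:
    fetcher: SketchMetadataFetcher
    # Hypothetical archive: origin URL -> ids of objects already stored.
    archive: Dict[str, Set[str]]
    parent_origins: List[str] = field(default_factory=list)

    def load(self, origin_url: str, remote_objects: Set[str]) -> Set[str]:
        """Return the object ids that actually need to be fetched."""
        # Metadata is fetched at load time, so fork parents are known up front.
        self.parent_origins = self.fetcher.get_parent_origins(origin_url)
        # Objects already archived for this origin or any of its parents.
        known: Set[str] = set(self.archive.get(origin_url, set()))
        for parent in self.parent_origins:
            known |= self.archive.get(parent, set())
        wanted = remote_objects - known
        # Record everything reachable from this origin after the load.
        self.archive[origin_url] = self.archive.get(origin_url, set()) | remote_objects
        return wanted


forge = FakeForge({"https://example.org/fork": "https://example.org/upstream"})
loader = SketchLoader(
    SketchMetadataFetcher(forge),
    {"https://example.org/upstream": {"a", "b"}},
)
wanted = loader.load("https://example.org/fork", {"a", "b", "c"})
```

In this toy run the fork shares objects `a` and `b` with its already-archived parent, so only `c` remains to be fetched.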
Apr 11 2022, 2:36 PM · Extrinsic metadata
moranegg added a comment to T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders.

To me the advantages are strong, especially points 1 and 4.

Apr 11 2022, 2:11 PM · Extrinsic metadata
vlorentz added a comment to T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders.

The original idea for this was to have separate tasks to fetch metadata, so that loaders did not have forge-specific code to fetch metadata.

Apr 11 2022, 1:45 PM · Extrinsic metadata
vlorentz closed T3542: Decide what metadata we want to / can collect from GitHub as Resolved.

Looks like *what* we want to collect is a solved issue.

Apr 11 2022, 9:46 AM · Origin-GitHub, Extrinsic metadata
vlorentz closed T3542: Decide what metadata we want to / can collect from GitHub, a subtask of T1747: Review APIs to get metadata from supported origins, as Resolved.
Apr 11 2022, 9:46 AM · Extrinsic metadata

Mar 25 2022

bchauvet raised the priority of T2202: Collect extrinsic metadata from Normal to High.
Mar 25 2022, 5:30 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata

Mar 23 2022

bchauvet added a project to T2202: Collect extrinsic metadata: Roadmap 2022.
Mar 23 2022, 4:48 PM · Roadmap 2022, meta-task, Roadmap 2021, Extrinsic metadata
bchauvet added a parent task for T3097: Expose metadata in the WebApp and make it searchable: T4081: Show metadata on Web UI.
Mar 23 2022, 4:45 PM · Intrinsic metadata, Extrinsic metadata, Roadmap 2021, meta-task

Feb 23 2022

anlambert closed T3967: "Link" header is not properly displayed in apidoc when it contains [], a subtask of T3559: Enable the swh-search QL in staging, as Resolved.
Feb 23 2022, 5:39 PM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata

Feb 22 2022

vlorentz added a parent task for T3558: Enable the swh-search QL in production: T3952: Make the search query language a first class citizen.
Feb 22 2022, 6:56 PM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata

Feb 21 2022

vlorentz added a subtask for T3559: Enable the swh-search QL in staging: T3967: "Link" header is not properly displayed in apidoc when it contains [].
Feb 21 2022, 3:42 PM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata

Feb 7 2022

vlorentz updated the task description for T3558: Enable the swh-search QL in production.
Feb 7 2022, 10:24 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz updated the task description for T3559: Enable the swh-search QL in staging.
Feb 7 2022, 10:23 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata

Jan 18 2022

vlorentz added a parent task for T3859: investigate using metadata from GHTorrent: T1740: fetch extrinsic origin metadata from GitHub.
Jan 18 2022, 12:43 PM · Extrinsic metadata
vlorentz triaged T3859: investigate using metadata from GHTorrent as Normal priority.
Jan 18 2022, 12:43 PM · Extrinsic metadata

Jan 13 2022

vsellier removed a revision from T833: When listing an origin, add origin level metadata to RMD storage: D6946: netbox: use the centralized admin db.
Jan 13 2022, 4:24 PM · Extrinsic metadata, Restricted Project, GitHub lister
vsellier added a revision to T833: When listing an origin, add origin level metadata to RMD storage: D6946: netbox: use the centralized admin db.
Jan 13 2022, 4:21 PM · Extrinsic metadata, Restricted Project, GitHub lister

Nov 22 2021

vlorentz closed T3636: Make the opam loader write extrinsic metadata as Resolved.
Nov 22 2021, 2:44 PM · Extrinsic metadata, Opam

Nov 10 2021

ardumont closed T3722: staging: Deploy package loader v1.0, deposit server v0.16, lister v2.3, a subtask of T3636: Make the opam loader write extrinsic metadata, as Resolved.
Nov 10 2021, 4:43 PM · Extrinsic metadata, Opam
ardumont changed the status of T3722: staging: Deploy package loader v1.0, deposit server v0.16, lister v2.3, a subtask of T3636: Make the opam loader write extrinsic metadata, from Open to Work in Progress.
Nov 10 2021, 3:33 PM · Extrinsic metadata, Opam
vlorentz added a subtask for T3636: Make the opam loader write extrinsic metadata: T3722: staging: Deploy package loader v1.0, deposit server v0.16, lister v2.3.
Nov 10 2021, 3:20 PM · Extrinsic metadata, Opam

Nov 8 2021

vlorentz added a revision to T3636: Make the opam loader write extrinsic metadata: D6606: opam: Write package definitions to the extrinsic metadata storage.
Nov 8 2021, 11:58 AM · Extrinsic metadata, Opam

Oct 21 2021

vlorentz closed T1344: Write specs about metadata workflow, a subtask of T833: When listing an origin, add origin level metadata to RMD storage, as Resolved.
Oct 21 2021, 2:12 PM · Extrinsic metadata, Restricted Project, GitHub lister
moranegg added a subtask for T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders: T3681: Review extrinsic metadata specification.
Oct 21 2021, 12:59 PM · Extrinsic metadata
moranegg added a parent task for T3681: Review extrinsic metadata specification: T1739: Define an architecture to fetch extrinsic metadata outside listers and loaders.
Oct 21 2021, 12:59 PM · Extrinsic metadata
moranegg added a comment to T3681: Review extrinsic metadata specification.

I think we (or, I should say, I) missed the ambiguity of the concept of origin.

Oct 21 2021, 12:36 PM · Extrinsic metadata
moranegg triaged T3681: Review extrinsic metadata specification as Normal priority.
Oct 21 2021, 12:33 PM · Extrinsic metadata

Oct 8 2021

vlorentz added a comment to T3636: Make the opam loader write extrinsic metadata.

(this task is a dependency of T3638, because author != committer in revisions created by opam, and releases don't have a "committer" field, so switching to releases would lose this data)

Oct 8 2021, 2:33 PM · Extrinsic metadata, Opam
vlorentz added a parent task for T3636: Make the opam loader write extrinsic metadata: T3638: Make package loaders create releases objects instead of revisions.
Oct 8 2021, 2:32 PM · Extrinsic metadata, Opam
vlorentz edited projects for T3636: Make the opam loader write extrinsic metadata, added: Extrinsic metadata; removed Metadata workflow.
Oct 8 2021, 2:26 PM · Extrinsic metadata, Opam

Sep 6 2021

vlorentz removed a project from T3559: Enable the swh-search QL in staging: meta-task.
Sep 6 2021, 10:37 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz removed a project from T3558: Enable the swh-search QL in production: meta-task.
Sep 6 2021, 10:37 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz added a project to T3558: Enable the swh-search QL in production: Archive search.
Sep 6 2021, 10:36 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz triaged T3559: Enable the swh-search QL in staging as Normal priority.
Sep 6 2021, 10:36 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz added a project to T3558: Enable the swh-search QL in production: System administration.
Sep 6 2021, 10:36 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz triaged T3558: Enable the swh-search QL in production as Normal priority.
Sep 6 2021, 10:36 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata

Sep 3 2021

vlorentz added a revision to T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage: D5865: Add endpoints to access REMD by id.
Sep 3 2021, 11:38 AM · Storage manager, Extrinsic metadata
vlorentz closed T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage, a subtask of T2703: Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects, as Resolved.
Sep 3 2021, 11:38 AM · Data Model, Storage manager, Extrinsic metadata
vlorentz closed T3018: Allow querying raw_extrinsic_metadata by hash in swh-storage as Resolved.
Sep 3 2021, 11:38 AM · Storage manager, Extrinsic metadata

Sep 2 2021

vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

I updated the task with a breakdown of the cost of getting each info.

Sep 2 2021, 12:12 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 2 2021, 12:04 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 2 2021, 11:56 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 2 2021, 11:54 AM · Origin-GitHub, Extrinsic metadata