General
Active · Public

Members

  • This project does not have any members.

Watchers

  • This project does not have any watchers.

Details

Description

general Software Heritage product, for issues that cannot be classified more specifically (yet)

Recent Activity

Jul 2 2020

rdicosmo raised the priority of T1099: support origin blacklist for archive search and browse from Low to High.

This is an important feature: it has been dormant for a while, but we need to actually start implementing it.

Jul 2 2020, 8:21 PM · General, Web app

Jun 19 2020

zack triaged T2460: public journal of notable archiving policy changes as Normal priority.
Jun 19 2020, 9:54 AM · General

Jun 18 2020

anlambert edited projects for T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths, added: UX; removed 2019 UX audit.
Jun 18 2020, 11:52 AM · UX, Web app, General

Jun 9 2020

anlambert moved T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths from Backlog to Deployed on the 2019 UX audit board.
Jun 9 2020, 4:26 PM · UX, Web app, General

May 6 2020

anlambert added a revision to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths: D3129: common/identifiers: Add SWHIDs contextual information computation.
May 6 2020, 4:09 PM · UX, Web app, General

Apr 17 2020

anlambert closed T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths as Resolved.

It seems Phabricator reopened the task automatically with my last comment; that was not intended.

Apr 17 2020, 11:26 AM · UX, Web app, General
anlambert reopened T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths as "Work in Progress".

@moranegg, for the branch case the anchor will be the revision it points to. For your example, it will be

Apr 17 2020, 9:59 AM · UX, Web app, General

Apr 16 2020

moranegg added a parent task for T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths: T2366: Review Persistent identifiers (PIDs) with context in deposit.
Apr 16 2020, 11:26 PM · UX, Web app, General
moranegg added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

Just a question about using a path with a different branch, for example for a tag of a version (which is not a release):

  • in this case, the anchor is the snp and the branch name (the tag) is in the path?
Apr 16 2020, 11:20 PM · UX, Web app, General

Mar 30 2020

rdicosmo closed T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths as Resolved.

This is now done in the few commits leading to https://forge.softwareheritage.org/rDMODaccca603c42ad68252532222ca6467a19691524e

Mar 30 2020, 4:59 PM · UX, Web app, General

Mar 27 2020

rdicosmo updated the task description for T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.
Mar 27 2020, 1:40 PM · UX, Web app, General

Mar 24 2020

rdicosmo added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

@zack thanks for spotting the missing pieces... now fixed in the description, we're ready to go! :-)
Would you take care of extending the definition in https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html ?

Mar 24 2020, 6:05 PM · UX, Web app, General
zack added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

@rdicosmo: the current version of the full example above LGTM (the surrounding text is inconsistent, e.g., it still mentions "snp" as a key and forbids snapshot anchors, but I suspect it's just that you didn't bother editing everything. Hence, we're good! :-))

Mar 24 2020, 6:00 PM · UX, Web app, General

Mar 23 2020

rdicosmo added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

Update the proposal with visit instead of snp

Mar 23 2020, 9:07 PM · UX, Web app, General
rdicosmo added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

About the anchor point: no objection to also having snapshot as a possible anchor in the schema.

Mar 23 2020, 7:08 PM · UX, Web app, General
zack added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

(removed the last point, the hierarchy thing is in fact not relevant here, as we're pointing upward, not downward)

Mar 23 2020, 5:20 PM · UX, Web app, General
zack added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

LGTM in general.

Mar 23 2020, 5:19 PM · UX, Web app, General
rdicosmo added a parent task for T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths: T2330: Simplify Permalinks box.
Mar 23 2020, 4:28 PM · UX, Web app, General
rdicosmo removed a project from T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths: Restricted Project.
Mar 23 2020, 4:28 PM · UX, Web app, General
rdicosmo added projects to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths: Restricted Project, 2019 UX audit.
Mar 23 2020, 4:11 PM · UX, Web app, General
rdicosmo raised the priority of T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths from Low to Normal.
Mar 23 2020, 4:00 PM · UX, Web app, General
rdicosmo changed the status of T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths from Open to Work in Progress.

As part of the discussion about the revamped UX, we have simplified the proposal for describing paths in the Merkle DAG. When the anchor denotes a revision (and most often when it's a release), it's trivial to find in the DAG the root directory of the source code, and we only need the file path to identify the content we are interested in. When it's a snapshot, there is a default root directory to point to.

Mar 23 2020, 3:56 PM · UX, Web app, General
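
The simplified proposal in the status change above (an anchor plus a file path) can be illustrated with a short sketch. It is not taken from the task itself: it assumes qualifiers are appended to the core identifier as semicolon-separated `key=value` pairs, the function names are hypothetical, and the hashes in the usage note are placeholders rather than real archive objects.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import quote, unquote


def build_qualified_id(core: str,
                       visit: Optional[str] = None,
                       anchor: Optional[str] = None,
                       path: Optional[str] = None) -> str:
    """Append optional context qualifiers to a core identifier.

    Assumption: qualifiers are ';key=value' pairs. Values are
    percent-encoded so that a ';' inside a path cannot be mistaken
    for a qualifier separator.
    """
    parts = [core]
    for key, value in (("visit", visit), ("anchor", anchor), ("path", path)):
        if value is not None:
            parts.append(f"{key}={quote(value, safe='/:')}")
    return ";".join(parts)


def parse_qualified_id(text: str) -> Tuple[str, Dict[str, str]]:
    """Split a qualified identifier into its core and a qualifier dict."""
    core, *qualifiers = text.split(";")
    parsed: Dict[str, str] = {}
    for item in qualifiers:
        key, _, value = item.partition("=")
        parsed[key] = unquote(value)
    return core, parsed
```

For example, with placeholder hashes, `build_qualified_id("swh:1:cnt:" + "0" * 40, anchor="swh:1:rev:" + "1" * 40, path="/src/main.c")` yields the content identifier followed by `;anchor=...;path=/src/main.c`: the anchor names the revision and the path locates the file under that revision's root directory.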

Jan 23 2020

olasd closed T533: Allow loaders to register partial state (meta task) as Resolved.

This is done in all loaders by now.

Jan 23 2020, 2:12 PM · General
olasd closed T535: Register partial task state in scheduler database, a subtask of T533: Allow loaders to register partial state (meta task), as Resolved.
Jan 23 2020, 1:57 PM · General

Jan 22 2020

vlorentz renamed T1102: Handle all GitHub elements from Handle all GitHub elements (meta task) to Handle all GitHub elements.
Jan 22 2020, 4:24 PM · meta-task, General

Nov 7 2019

olasd placed T349: Investigate alternatives to Celery + RabbitMQ up for grabs.
Nov 7 2019, 2:18 PM · General

Jun 18 2019

ardumont closed T807: dogfooding: ingest the Software Heritage forge into the archive (via the canonical URLs) as Resolved.
Jun 18 2019, 2:19 PM · General

Jun 17 2019

zack closed T239: preserve at least 2 copies of each content object as Resolved.

resolved (by T691)

Jun 17 2019, 4:45 PM · General
zack added a comment to T691: complete object storage mirror on Azure (meta task).
In T691#33551, @olasd wrote:

After processing the logs of the backfilling process to make sure to redo all the ranges that were interrupted in various database migrations, I'm now confident that this task is complete: we have a full mirror of all contents on Azure, which is kept up to date by the main archive storage backend writing synchronously to it.

Jun 17 2019, 4:45 PM · General
olasd closed T691: complete object storage mirror on Azure (meta task) as Resolved.

After processing the logs of the backfilling process to make sure to redo all the ranges that were interrupted in various database migrations, I'm now confident that this task is complete: we have a full mirror of all contents on Azure, which is kept up to date by the main archive storage backend writing synchronously to it.

Jun 17 2019, 4:25 PM · General
olasd closed T691: complete object storage mirror on Azure (meta task), a subtask of T239: preserve at least 2 copies of each content object, as Resolved.
Jun 17 2019, 4:25 PM · General

Jun 7 2019

olasd added a comment to T691: complete object storage mirror on Azure (meta task).
  • The main archive currently writes all contents synchronously to Azure as well as to the local storage (the gap is strictly closing)
  • All partitions from uffizi have been copied to Azure and mass-injected (except for partition 8, which was only partially mass-injected)
  • After this process, it looks like Azure is missing 10% of all objects (excluding partition 8), all of which are on banco
    • I've started a procedure to copy the missing objects from banco directly. Estimated time to completion: ~1 month
    • The same procedure has been started to copy the missing objects from partition 8 on uffizi. Estimated time to completion: ~15 days
Jun 7 2019, 7:30 PM · General
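
The feed does not describe how the copy procedure in the comment above works. As a rough illustration of the general idea only, the sketch below finds the objects present in a source store but absent from a mirror and copies them over. The dict-backed stores and both function names are hypothetical stand-ins, not the actual object storage API.

```python
from typing import Dict, List


def find_missing(source: Dict[str, bytes], mirror: Dict[str, bytes]) -> List[str]:
    """Return the ids of objects that exist in source but not in mirror."""
    return sorted(obj_id for obj_id in source if obj_id not in mirror)


def copy_missing(source: Dict[str, bytes], mirror: Dict[str, bytes]) -> int:
    """Copy every missing object into the mirror and return how many were copied."""
    missing = find_missing(source, mirror)
    for obj_id in missing:
        mirror[obj_id] = source[obj_id]
    return len(missing)
```

Running `copy_missing` a second time copies nothing, so an interrupted run of this kind can simply be restarted.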

May 25 2019

zack closed Unknown Object (Maniphest Task), a subtask of T691: complete object storage mirror on Azure (meta task), as Resolved.
May 25 2019, 5:06 PM · General
zack added a parent task for T691: complete object storage mirror on Azure (meta task): T239: preserve at least 2 copies of each content object.
May 25 2019, 5:05 PM · General
zack added a subtask for T239: preserve at least 2 copies of each content object: T691: complete object storage mirror on Azure (meta task).
May 25 2019, 5:05 PM · General
zack added a comment to T691: complete object storage mirror on Azure (meta task).

@olasd recently made a lot of progress on this one.

May 25 2019, 4:56 PM · General

May 23 2019

vlorentz renamed T1102: Handle all GitHub elements from Handle all GitHub elements to Handle all GitHub elements (meta task).
May 23 2019, 12:11 PM · meta-task, General
vlorentz removed a subtask for T1102: Handle all GitHub elements: T833: When listing an origin, add origin level metadata to storage.
May 23 2019, 12:07 PM · meta-task, General
vlorentz removed a project from T1102: Handle all GitHub elements: Restricted Project.
May 23 2019, 11:59 AM · meta-task, General
vlorentz renamed T1102: Handle all GitHub elements from Handle GitHub elements to Handle all GitHub elements.
May 23 2019, 11:58 AM · meta-task, General

May 15 2019

nahimilega closed T808: phabricator lister, a subtask of T807: dogfooding: ingest the Software Heritage forge into the archive (via the canonical URLs), as Resolved.
May 15 2019, 4:42 PM · General

Apr 13 2019

zack added a comment to T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.

For file paths it would be nice to also support steps that use usual file/dir names foo/bar/baz, as a more readable alternative to number-based steps.

Apr 13 2019, 4:52 PM · UX, Web app, General
zack renamed T1241: Persistent identifiers (PIDs): add a way to describe Merkle DAG paths from Describing paths in the Merkle DAG to Persistent identifiers (PIDs): add a way to describe Merkle DAG paths.
Apr 13 2019, 4:47 PM · UX, Web app, General

Mar 12 2019

zack closed T565: embrace repository snapshot object in the data model (meta task) as Resolved.

Unless I'm missing something, this was completed a while ago (if not, please reopen, ideally adding the relevant open sub-task).

Mar 12 2019, 10:10 AM · General

Feb 21 2019

zack closed T1087: facet/metadata-based project search as Resolved.

This is now fixed (by @vlorentz) and deployed.

Feb 21 2019, 8:07 PM · Metadata workflow, General, Web app
vlorentz added a project to T1102: Handle all GitHub elements: Restricted Project.
Feb 21 2019, 10:11 AM · meta-task, General

Jan 15 2019

ardumont closed T359: Indexers: batch content analyzer infrastructure as Resolved.

We need to rework the current indexer implementation to use ranges of ids instead of lists of ids (T991).
After that, we can schedule 256 ranges of contents to index using the scheduler stack.
And see where that goes.

Jan 15 2019, 2:44 PM · Indexer, General
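
The comment above does not say how the 256 ranges would be defined. One plausible reading, shown here as a sketch under stated assumptions, is to split the id space evenly: it assumes content ids are fixed-length 20-byte hashes, and `content_id_ranges` is a hypothetical helper, not part of the indexer code.

```python
from typing import List, Tuple

HASH_LEN = 20  # assumption: ids are 20-byte (SHA-1 sized) hashes


def content_id_ranges(n_ranges: int = 256,
                      hash_len: int = HASH_LEN) -> List[Tuple[bytes, bytes]]:
    """Split the whole id space into n_ranges contiguous, inclusive ranges."""
    total = 1 << (8 * hash_len)  # number of possible ids
    step = total // n_ranges
    ranges = []
    for i in range(n_ranges):
        start = i * step
        # The last range absorbs any remainder so that the space is fully covered.
        end = total - 1 if i == n_ranges - 1 else (i + 1) * step - 1
        ranges.append((start.to_bytes(hash_len, "big"),
                       end.to_bytes(hash_len, "big")))
    return ranges
```

With the defaults, each range corresponds to one value of the first byte of the id, so one task per range could be handed to the scheduler independently of the others.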

Dec 4 2018

ardumont closed T1227: General improvments of the indexer: Schedule indexer tasks, a subtask of T359: Indexers: batch content analyzer infrastructure, as Resolved.
Dec 4 2018, 11:47 AM · Indexer, General

Nov 27 2018

vlorentz added a parent task for T359: Indexers: batch content analyzer infrastructure: T1385: Monitor output of metadata indexers.
Nov 27 2018, 11:58 AM · Indexer, General

Oct 19 2018

ardumont edited subtasks for T359: Indexers: batch content analyzer infrastructure, added: T1227: General improvments of the indexer: Schedule indexer tasks; removed: T991: Indexers: Send range of ids instead of list of ids.
Oct 19 2018, 8:47 AM · Indexer, General