general Software Heritage product, for issues that cannot be classified more specifically (yet)
Jul 2 2020
This is an important feature: it has been dormant for a while, but we need to actually start implementing it.
Jun 19 2020
Jun 18 2020
Jun 9 2020
May 6 2020
Apr 17 2020
It seems Phabricator automatically reopened the task with my last comment; that was not intended.
@moranegg, for the branch case, the anchor will be the revision it points to. For your example, it will be
Apr 16 2020
Just a question about using a path with a different branch, for example for a tag of a version (which is not a release):
- in this case, is the anchor the snp, with the branch name (the tag) in the path?
Mar 30 2020
This is now done in the few commits leading to https://forge.softwareheritage.org/rDMODaccca603c42ad68252532222ca6467a19691524e
Mar 27 2020
Mar 24 2020
@zack thanks for spotting the missing pieces... now fixed in the description, we're ready to go! :-)
Would you take care of extending the definition in https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html ?
@rdicosmo: the current version of the full example above LGTM (the surrounding text is inconsistent, e.g., it still mentions "snp" as a key and forbids snapshot anchors, but I suspect you just didn't bother editing everything). Hence, we're good! :-)
Mar 23 2020
Updated the proposal to use visit instead of snp.
About the anchor point: no objection to also having snapshot as a possible anchor in the schema.
(removed the last point, the hierarchy thing is in fact not relevant here, as we're pointing upward, not downward)
LGTM in general.
As part of the discussion about the revamped UX, we have simplified the proposal for describing paths in the Merkle DAG. When the anchor denotes a revision (and most often when it's a release), it's trivial to find in the DAG the root directory of the source code, and we only need the file path to identify the content we are interested in. When it's a snapshot, there is a default root directory to point to.
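To make the simplified scheme concrete, a qualified identifier under this proposal might look like the following (the hashes here are placeholders, not real archive objects; the qualifier names `anchor` and `path` are the ones used in the proposal):

```
swh:1:cnt:cccccccccccccccccccccccccccccccccccccccc;anchor=swh:1:rev:dddddddddddddddddddddddddddddddddddddddd;path=/src/main.c
```

Here the anchor denotes a revision, so `/src/main.c` is resolved from that revision's root directory, with no need for any further snapshot or branch information.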
Jan 23 2020
This is done in all loaders by now.
Jan 22 2020
Nov 7 2019
Jun 18 2019
Jun 17 2019
resolved (by T691)
After processing the logs of the backfilling process to make sure all the ranges interrupted by the various database migrations were redone, I'm now confident that this task is complete: we have a full mirror of all contents on Azure, kept up to date by the main archive storage backend writing to it synchronously.
Jun 7 2019
- The main archive currently synchronously writes all contents to Azure as well as the local storage (the gap is strictly closing)
- all partitions from uffizi have been copied to Azure and mass-injected (except for partition 8, which was only partially mass-injected)
- after this process, it looks like Azure is missing 10% of all objects (excluding partition 8), all of which are on banco
- I've started a procedure to copy the missing objects directly from banco. Estimated time to completion: ~1 month
- the same procedure has been started to copy the missing objects from partition 8 on uffizi. Estimated time to completion: ~15 days
May 25 2019
@olasd recently made a lot of progress on this one.
May 23 2019
May 15 2019
Apr 13 2019
For file paths, it would be nice to also support steps using the usual file/directory names (foo/bar/baz), as a more readable alternative to number-based steps.
Mar 12 2019
Unless I'm missing something, this was completed a while ago (if not, please reopen, ideally adding the relevant open sub-task).
Feb 21 2019
This is now fixed (by @vlorentz) and deployed.
Jan 15 2019
We need to rework the current indexer implementation to use ranges instead (T991).
After that, we can schedule 256 ranges of contents to index using the scheduler stack.
And see where that goes.
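A minimal sketch of how the content keyspace could be cut into 256 ranges, one per leading byte of the SHA1. This is illustrative only: the function name and the (start, end) hex representation are assumptions, not the actual swh-indexer or swh-scheduler API.

```python
def sha1_ranges(bits=8):
    """Split the 160-bit SHA1 keyspace into 2**bits contiguous
    ranges, yielded as inclusive (start, end) 40-digit hex bounds.

    With the default bits=8 this produces the 256 ranges mentioned
    above, each covering one value of the leading byte.
    """
    shift = 160 - bits
    for i in range(1 << bits):
        start = format(i << shift, "040x")       # e.g. "00" + 38 zeros
        end = format(((i + 1) << shift) - 1, "040x")  # e.g. "00" + 38 f's
        yield start, end

ranges = list(sha1_ranges())
assert len(ranges) == 256
assert ranges[0] == ("0" * 40, "00" + "f" * 38)
assert ranges[-1] == ("ff" + "0" * 38, "f" * 40)
```

Each such range could then be submitted as one independent indexing task, so an interrupted run only needs to redo the ranges that did not complete.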