ardumont retitled D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function from Factorize common loader behavior within from_disk.iter_directory to Factorize common loader behavior within swh.model.from_disk.iter_directory function.

Jul 2 2020, 12:23 PM

ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.

Jul 2 2020, 12:12 PM

ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.

Jul 2 2020, 12:12 PM

Harbormaster failed remote builds in B13256: Diff 12036 for D3393: loader.svn: Reuse swh.model.from_disk.iter_directory function!

Jul 2 2020, 12:09 PM

swh-public-ci added a comment to D3393: loader.svn: Reuse swh.model.from_disk.iter_directory function.

Build has FAILED

Jul 2 2020, 12:09 PM

ardumont created D3393: loader.svn: Reuse swh.model.from_disk.iter_directory function.

Jul 2 2020, 12:08 PM

zack added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

@civodul I wanted to raise the topic of storing container metadata (in the style of what tools like pristine-tar do) here too, so thanks for giving me the chance :-)
I agree it might be a technical solution, *but* I'm not sure I see the point.
Didn't you agree that having a "lookup service" from tarball/container checksums to SWHIDs (the Software Heritage identifiers, which can then be used to look up stuff in the archive) would be enough to satisfy distro needs?
If yes, then "archiving container metadata" could be replaced by simply having a way to add entries to the lookup table. And allowing distros to do so is an option that we can explore. (Once the service exists, of course.)

Jul 2 2020, 12:07 PM · Data Model

Harbormaster failed remote builds in B13255: Diff 12035 for D3392: loader-core: Reuse swh.model.from_disk.iter_directory function!

Jul 2 2020, 12:06 PM

swh-public-ci added a comment to D3392: loader-core: Reuse swh.model.from_disk.iter_directory function.

Build has FAILED

Jul 2 2020, 12:06 PM

ardumont created D3392: loader-core: Reuse swh.model.from_disk.iter_directory function.

Jul 2 2020, 12:05 PM

swh-public-ci added a comment to D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Build is green

Jul 2 2020, 12:04 PM

ardumont updated the diff for D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Make the returned results a tuple of lists

Jul 2 2020, 12:02 PM

civodul added a comment to T2430: lookup ingested tarballs (or similar source code containers) by container checksum.

Am I right that the primary reason tarballs aren't systematically archived is that doing so would be too expensive storage-wise (no deduplication)?

Jul 2 2020, 12:00 PM · Data Model

swh-public-ci added a comment to D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Build is green

Jul 2 2020, 12:00 PM

ardumont created D3391: Refactor common loader behavior within swh.model.from_disk.iter_directory function.

Jul 2 2020, 11:58 AM

anlambert committed rDWAPPS76d9162807e0: origin_visits/get_origin_visit: Improve default visit picking strategy (authored by anlambert).

origin_visits/get_origin_visit: Improve default visit picking strategy

Jul 2 2020, 11:39 AM

anlambert closed D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Jul 2 2020, 11:39 AM

swh-public-ci added a comment to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Build is green

Jul 2 2020, 11:35 AM

ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.

Jul 2 2020, 11:24 AM

anlambert updated the diff for D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Update: Improve tests implementation

Jul 2 2020, 11:24 AM

ardumont added a comment to D3383: Implement {directory,revision,release,snapshot}_metadata_{add,get}..

I'll rewrite this using swh-model and only two endpoints for all types

Jul 2 2020, 11:22 AM

ardumont updated the summary of D3390: Unify object_type some more within the merkle and from_disk modules.

Jul 2 2020, 11:17 AM

vlorentz abandoned D3383: Implement {directory,revision,release,snapshot}_metadata_{add,get}..

I'll rewrite this using swh-model and only two endpoints for all types

Jul 2 2020, 11:11 AM

swh-public-ci added a comment to D3390: Unify object_type some more within the merkle and from_disk modules.

Build is green

Jul 2 2020, 11:10 AM

ardumont created D3390: Unify object_type some more within the merkle and from_disk modules.

Jul 2 2020, 11:09 AM

vlorentz closed D3382: Move tests of content_metadata_* next to origin_metadata_*.

Landed as 248c277445adbae5813ba80ce0618858d8126634.

Jul 2 2020, 11:05 AM

vlorentz committed rDSTO248c277445ad: Move tests of content_metadata_* next to origin_metadata_* (authored by vlorentz).

Move tests of content_metadata_* next to origin_metadata_*

Jul 2 2020, 11:04 AM

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 2 2020, 10:32 AM · Storage manager, Data Model

vlorentz abandoned D3247: [WIP] Add content_metadata_{add,get}..

Replaced by linked diffs

Jul 2 2020, 10:12 AM

Jul 1 2020

ardumont accepted D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

nice ;)

Jul 1 2020, 8:21 PM

swh-public-ci added a comment to D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Build is green

Jul 1 2020, 6:26 PM

anlambert updated the diff for D3385: origin_visits/get_origin_visit: Improve default visit picking strategy.

Use new swh.storage.algos.origin.origin_get_latest_visit_status utility function in revised get_origin_visit implementation

Jul 1 2020, 6:15 PM

swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build is green

Jul 1 2020, 6:03 PM

douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

more mypy vs. attrs-strict fighting

Jul 1 2020, 6:02 PM

Harbormaster failed remote builds in B13246: Diff 12027 for D3389: Extract the extra_headers from metadata on the Revision model class!

Jul 1 2020, 5:41 PM

swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build has FAILED

Jul 1 2020, 5:41 PM

douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

make mypy happy (hopefully)

Jul 1 2020, 5:39 PM

moranegg triaged T2472: Indexing intrinsic metadata in a deposit using a sub-folder for the content as Normal priority.

Jul 1 2020, 5:35 PM · Intrinsic metadata, Indexer, SWORD deposit

Harbormaster failed remote builds in B13245: Diff 12026 for D3389: Extract the extra_headers from metadata on the Revision model class!

Jul 1 2020, 5:23 PM

swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build has FAILED

Jul 1 2020, 5:23 PM

douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

restrict extra_headers to (bytes, bytes) only

Jul 1 2020, 5:23 PM

olasd edited P708 non bytes extra_headers.

Jul 1 2020, 5:16 PM

olasd updated the language for P708 non bytes extra_headers from autodetect to remarkup.

Jul 1 2020, 5:12 PM

olasd created P708 non bytes extra_headers.

Jul 1 2020, 5:12 PM

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 1 2020, 4:22 PM · Storage manager, Data Model

Harbormaster failed remote builds in B13244: Diff 12025 for D3389: Extract the extra_headers from metadata on the Revision model class!

Jul 1 2020, 4:05 PM

swh-public-ci added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

Build has FAILED

Jul 1 2020, 4:05 PM

douardda updated the diff for D3389: Extract the extra_headers from metadata on the Revision model class.

improve bw-compat support, tests and hypothesis strategies

Jul 1 2020, 4:04 PM

rdicosmo added a comment to T2344: Build a connector for software deposit via Zenodo/InvenioRDM.

In T2344#43031, @moranegg wrote:

Great news!!

Does this mean we need to be SWORD 3 compatible?

Jul 1 2020, 4:03 PM · meta-task, Roadmap 2022, Roadmap 2020, SWORD deposit, Scientific Community Building

ardumont updated the task description for T2310: Make origin visits immutable.

Jul 1 2020, 4:01 PM · Storage manager, Data Model

ardumont committed rDJNLf6ab436d3934: requirements-swh: Add missing version requirement on swh.model (authored by ardumont).

requirements-swh: Add missing version requirement on swh.model

Jul 1 2020, 4:01 PM

olasd added a comment to D3389: Extract the extra_headers from metadata on the Revision model class.

I'm not sure about popping the extra_headers off of the incoming metadata field right now. This feels like something we want to do long term, but if you do that right now, it means you're putting upgrades of swh.model, swh.storage (with support for the new field), and all loaders in lockstep with one another. If you upgrade swh.model now, you'll lose the extra_headers until swh.storage is able to store them.

Jul 1 2020, 4:00 PM

ardumont abandoned D3376: journal_data: Drop obsolete origin_visit fields.

Jul 1 2020, 3:57 PM

ardumont added a comment to D3376: journal_data: Drop obsolete origin_visit fields.

Landed in eda756252f2151ba602ae47646e35fab4a83ebc1

Jul 1 2020, 3:57 PM

ardumont committed rDJNLeda756252f21: journal_data: Drop obsolete origin_visit fields (authored by ardumont).

journal_data: Drop obsolete origin_visit fields

Jul 1 2020, 3:56 PM

swh-public-ci added a comment to D3376: journal_data: Drop obsolete origin_visit fields.

Build is green

Jul 1 2020, 3:55 PM